bioconductor v3.9.0 BiovizBase
The biovizBase package is designed to provide a set of
Link to this section Summary
Functions
CRC
GC content computation for BSgenome
Adding disjoint levels to a GenomicRanges object
Utils for parsing (un)evaluated arguments
biovizBase is a package which provides utilities and color scheme...
Color blind safe palette generator
Checking if string contains letters or not
crc1.GeRL
Fetching GRanges from various data source
Subset of RNA editing sites in hg19...
Estimation of Coverage
Transform GRangesList to GRanges
Gene symbols with position...
Color scheme getter for biological data
Get formals from functions
get gaps for a stepping transformed GRanges object
Get ideogram information
Get ideogram.
Get scale information from a GRanges
Hg19 ideogram without cytoband information...
Hg19 ideogram with cytoband information...
ideogram without cytoband information
ideogram with cytoband information
Ideogram checking
Utils for Splicing Summary
Simple ideogram checking
parse x and y label information from a specific object
Estimated max gaps
mold data into data.frame
Summarize reads for certain region
Mismatch summary
Show colors
get x scale breaks and labels
Show colors
Shrinkage function
split a GRanges by formula
strip dots around a formula variables
Subset list of arguments by functions
Transform GRanges to different coordinates and layout
Transform GRanges with New Coordinates
Link to this section Functions
CRC()
CRC
Description
Genomic sequencing of colorectal adenocarcinomas identifies a recurrent VTI1A-TCF7L2 fusion sample data set
Usage
data(CRC)
Details
The data used in this study is from this a paper http://www.nature.com/ng/journal/v43/n10/full/ng.936.html.
Examples
data(CRC)
GCcontent()
GC content computation for BSgenome
Description
Compute GC content in a certain region of a BSgenome object
Usage
GCcontent(obj, ..., view.width, as.prob = TRUE)
Arguments
Argument | Description |
---|---|
obj | BSgenome object |
... | Arguments passed to getSeq method for BSgenome package. |
view.width | Passed to letterFrequencyInSlidingView , the constant (e.g. 35, 48, 1000) size of the "window" to slide along obj . The specified letters are tabulated in each window of length view.width . The rows of the result (see value) correspond to the various windows. |
as.prob | If TRUE return percentage of GC content, otherwise return counts. |
Details
GC content is an interesting variable may be related to various
biological questions. So we need a way to compute GC content in a
certain region of a reference genome. GCcontent function is a
wrapper around getSeq function in BSgenome package and
letterFrequency
, letterFrequencyInSlidingView
in Biostrings package.
if the view.width
is specified, the GC content will be computed
in the sliding view
Value
Numeric value indicate count or percentage
Author
Tengfei Yin
Examples
library(BSgenome.Hsapiens.UCSC.hg19)
GCcontent(Hsapiens, GRanges("chr1", IRanges(1e6, 1e6 + 1000)))
addStepping_method()
Adding disjoint levels to a GenomicRanges object
Description
Adding disjoint levels to a GenomicRanges object
Usage
list(list("addStepping"), list("GenomicRanges"))(obj, group.name, extend.size = 0,
fix = "center",
group.selfish = TRUE)
Arguments
Argument | Description |
---|---|
obj | A GenomicRanges object |
group.name | Column name in the elementMetadata which specify the grouping information of all the entries. If provided, this will make sure all intervals belong to the same group will try to be on the same level and nothing falls in between. |
extend.size | Adding invisible buffered region to the GenomicRanges object, if it's 10, then adding 5 at both end. This make the close neighbors assigned to the different levels and make your eyes easy to identify. |
fix | "start", "end", or "center"(default) passed to resize , denoting what to use as an anchor for each element of obj, and add extend.size to it's current width. |
group.selfish | Passed to addStepping , control whether to show each group as unique level or not. If set to FALSE , if two groups are not overlapped with each other, they will probably be layout in the same level to save space. |
Details
This is a tricky question, for example, pair-end RNA-seq data could be represented as two set of GenomicRanges object, one indicates the read, one indicates the junction. At the same time, we need to make sure pair-ended read are shown on the same level, and nothing falls in between. For better visualization of the data, we may hope to add invisible extended buffer to the reads, so closely neighbored reads will be on the different levels.
Value
A modified GenmicRanges object with stepping
as one column.
Author
Tengfei Yin
Examples
library(GenomicRanges)
set.seed(1)
N <- 500
## sample GRanges
gr <- GRanges(seqnames =
sample(c("chr1", "chr2", "chr3", "chrX", "chrY"),
size = N, replace = TRUE),
IRanges(
start = sample(1:300, size = N, replace = TRUE),
width = sample(70:75, size = N,replace = TRUE)),
strand = sample(c("+", "-", "*"), size = N,
replace = TRUE),
value = rnorm(N, 10, 3), score = rnorm(N, 100, 30),
group = sample(c("Normal", "Tumor"),
size = N, replace = TRUE),
pair = sample(letters, size = N,
replace = TRUE))
## grouping and extending
head(addStepping(gr))
head(addStepping(gr, group.name = "pair"))
gr.close <- GRanges(c("chr1", "chr1"), IRanges(c(10, 20), width = 9))
addStepping(gr.close)
addStepping(gr.close, extend.size = 5)
aes_utils()
Utils for parsing (un)evaluated arguments
Description
Utilities for parsing (un)evaluated arguments
Usage
parseArgsForAes(args)
parseArgsForNonAes(args)
Arguments
Argument | Description |
---|---|
args | arguments list. |
Value
For parseArgsForAes
return a unevaluated arguments.
For parseArgsNonForAes
return a evaluated/quoted arguments.
Author
Tengfei Yin
Examples
args <- alist(a = color, b = "b")
# parseArgsForAes(args)
biovizBase_package()
biovizBase is a package which provides utilities and color scheme...
Description
biovizBase is a package which provides utilities and color scheme for higher level graphic package which aim to visualize biological data especially genetic data.
Details
This package provides default color scheme for nucleotide, strand, amino acid, try to pass colorblind checking as possible as we can. And also provide giemsa stain result color scheme used to show cytoband. This package also provides utilites to manipulate and summarize raw data to get them ready to be visualized.
colorBlindSafePal()
Color blind safe palette generator
Description
This function help users create a function based on specified color blind safe palette. And the returned function could be used for color generation.
Usage
genDichromatPalInfo()
genBrewerBlindPalInfo()
genBlindPalInfo()
colorBlindSafePal(palette)
blind.pal.info
brewer.pal.blind.info
dichromat.pal.blind.info
Arguments
Argument | Description |
---|---|
palette | A index numeric value or character. Please see blind.pal.info , the palette could be "pal_id" or names the row in which users could specify the palette you want to use. |
Details
We get those color-blind safe palette based on http://colorbrewer2.org/ and http://geography.uoregon.edu/datagraphics/color_scales.htm those color are implemented in two packages, RColorBrewer and dichromat. But RColorBrewer doesn't provide subset of color-blind safe palette info. And dichromat doesn't group color based on "quality", "sequential" and "divergent", so we pick those color manually and create a combined palette, blind.pal.info.
colorBlindSafePal
will return a function, this function will
take two arguments, n
and repeatable
. if n is smaller than
3(n >= 3 is required by RColorBrewer), we use 3 instead and return a
color vector. If n is over the maxcolors
column in blind.pal.info, we
use maxcolors
instead when repeatable
set to FALSE
, if
repeatable
set to TRUE
we repeat the color of all the
allowed colors(length equals maxcolors
) in the same order. This has
special case in certain graphics which is always displayed side by
side and don't worry about the repeated colors being neighbors.
genBrewerBlindPalInfo
return brewer.pal.blind.info
data
frame containing all color-blind safe palettes from
brewer.pal.info
defined in RColorBrewer, but it's not only just
subset of it, it also changes some maxcolors
info.
genDichromatPalInfo
return dichromat.pal.blind.info
data
frame.
genBlindPalInfo
return blind.pal.info
data frame.
Value
A color generating function with arguments n
and
repeatable
. n
specifying how many different discrete colors
you want to get from them palette, and if repeatable turned on and set
to TRUE, you can specify n even larger than maximum color. The color
will be repeated following the same order.
Author
Tengfei Yin
Examples
library(scales)
## brewer subse of only color blind palette
brewer.pal.blind.info
genBrewerBlindPalInfo()
## dichromat info
dichromat.pal.blind.info
genDichromatPalInfo()
## all color blind palette, adding id/pkg.
blind.pal.info
## with no parameters, just return blind.pal.info
colorBlindSafePal()
mypal <- colorBlindSafePal(20)
## or pass character name
mypal <- colorBlindSafePal("Set2")
mypal12 <- colorBlindSafePal(22)
show_col(mypal(12, repeatable = FALSE)) # warning
show_col(mypal(11, repeatable = TRUE)) # no warning, and repeat
show_col(mypal12(12))
containLetters()
Checking if string contains letters or not
Description
Test if a string contain any letters
Usage
containLetters(obj, all=FALSE)
Arguments
Argument | Description |
---|---|
obj | String |
all | If set to FALSE, return TRUE when any letters appears; if all is set to TRUE, return TRUE only when the string is composed of just letters. |
Details
Useful when processing/sorting seqnames
Value
Logical value
Author
tengfei
Examples
containLetters("XYZ123")
containLetters("XYZ123", TRUE)
crc1GeRL()
crc1.GeRL
Description
CRC-1 mutation and structural rearrangment
Usage
data(crc1.GeRL)
Details
GenomicRangesList contains somatic mutation, reaarangment etc for a tumor sample, please check the reference for detials.
References
Genomic sequencing of colorectal adenocarcinomas identifies a recurrent VTI1A-TCF7L2 fusion
Examples
data(crc1.GeRL)
crc1.GeRL
crunch_method()
Fetching GRanges from various data source
Description
Fetching Granges from various data source, currently supported objects are TxDb, EnsDb, GAlignments and BamFile.
Usage
list(list("crunch"), list("TxDb"))(obj, which, columns = c("tx_id", "tx_name","gene_id"),
type = c("all", "reduce"), truncate.gaps = FALSE,
truncate.fun = NULL, ratio = 0.0025)
list(list("crunch"), list("EnsDb"))(obj, which, columns = c("tx_id", "gene_name","gene_id"),
type = c("all", "reduce"), truncate.gaps = FALSE,
truncate.fun = NULL, ratio = 0.0025)
list(list("crunch"), list("GAlignments"))(obj, which, truncate.gaps = FALSE,
truncate.fun = NULL, ratio = 0.0025)
list(list("crunch"), list("BamFile"))(obj, which, ..., type = c("gapped.pair", "raw", "all"),
truncate.gaps = FALSE, truncate.fun = NULL, ratio = 0.0025)
Arguments
Argument | Description |
---|---|
obj | supported objects are TxDb , GAlignments and BamFile . |
which | GRanges object. For TxDb object, could aslo be a list. For EnsDb , it can also be a single filter object extending AnnotationFilter-class , an AnnotationFilterList combining several such objects or a filter expression in form of a formula (see AnnotationFilter for examples). |
columns | columns to include in the output. |
type | default 'all' is to show the full model, 'reduce' is to show a single model. |
truncate.gaps | logical value, default FALSE . Whether to truncate gaps or not. |
truncate.fun | shrinkage function. |
ratio | numeric value, shrinking ratio. |
... | arguments passed to function readGAlignments . |
Value
GRanges object.
Author
Tengfei Yin
Examples
library(biovizBase)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
data(genesymbol, package = "biovizBase")
txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
obj <- txdb
temp <- crunch(txdb, which = genesymbol["BRCA1"], type = "all")
temp <- crunch(txdb, which = genesymbol["BRCA1"], type = "reduce")
## Fetching data from a EnsDb object.
library(EnsDb.Hsapiens.v75)
edb <- EnsDb.Hsapiens.v75
gr <- genesymbol["BRCA1"]
seqlevels(gr) <- sub(seqlevels(gr), pattern="chr", replacement="")
temp <- crunch(edb, which = gr)
## Alternatively, use the GenenameFilter from the AnnotationFilter package:
temp <- crunch(edb, which = GenenameFilter("BRCA1"))
## Or a filter expression
temp <- crunch(edb, which = ~ genename == "BRCA1")
darned_hg19_subset500()
Subset of RNA editing sites in hg19...
Description
Subset of RNA editing sites in hg19
Usage
data(darned_hg19_subset500)
Details
This data set provides a subset(500 sites only) of hg19 RNA editing sites, and originally from DARNED http://darned.ucc.ie/ for the hg19 assembly.
Examples
data(darned_hg19_subset500)
darned_hg19_subset500
estimateCoverage_method()
Estimation of Coverage
Description
Estimation of Coverage
Usage
list(list("estimateCoverage"), list("BamFile"))(x, maxBinSize = 2^14)
Arguments
Argument | Description |
---|---|
x | A BamFile object. |
maxBinSize | Max bin size. |
Value
A GRanges
object.
Author
Michael Lawrence
flatGrl()
Transform GRangesList to GRanges
Description
Transform GRangesList to GRanges.
Usage
flatGrl(object, indName = "grl_name")
Arguments
Argument | Description |
---|---|
object | a GRangesList object. |
indName | column named by 'indName' that groups the records by their original element in 'object'. |
Details
This method is different from default stack
, it integrate
elementMetadata of GRangesList
to the final coerced GRanges
.
Value
A GRanges
object.
Author
Tengfei Yin
Examples
library(GenomicRanges)
gr1 <- GRanges("chr1", IRanges(1:10, width = 5))
gr2 <- GRanges("chr2", IRanges(1:10, width = 5))
obj <- GRangesList(gr1, gr2)
values(obj) <- data.frame(a = 1:2, b = letters[1:2])
stack(obj)
flatGrl(obj)
genesymbol()
Gene symbols with position...
Description
Gene symbols with position
Usage
data(genesymbol)
Details
This data set provides genen symbols in human with position and starnd information, stored as GRanges object.
Examples
data(genesymbol)
head(genesymbol)
genesymbol["RBM17"]
getBioColor()
Color scheme getter for biological data
Description
This function tries to get default color scheme either from fixed palette or options for specified data set, usually just biological data.
Usage
getBioColor(type = c("DNA_BASES_N", "DNA_BASES", "DNA_ALPHABET",
"RNA_BASES_N", "RNA_BASES", "RNA_ALPHABET",
"IUPAC_CODE_MAP", "AMINO_ACID_CODE", "AA_ALPHABET",
"STRAND", "CYTOBAND"),
source = c("option", "default"))
Arguments
Argument | Description |
---|---|
type | Color set based on which you want to get. |
source | "option" tries to get color scheme from Options . This allow user to edit the color globally. "default" gets fixed color scheme. |
Details
Most data set specified in the type argument are defined in Biostrings package, such as "DNA_BASES", "DNA_ALPHABET", "RNA_BASES", "RNA_ALPHABET", "IUPAC_CODE_MAP", "AMINO_ACID_CODE", "AA_ALPHABET", please check the manual for more details.
"DNA_BASES_N" is just "DNA_BASES" with extra "N" used in certain
cases, like the result from applyPileup
in Rsamtools package.
We start with the five most used nucleotides, A,T,C,G,N
, In
genetics, GC-content
usually has special biological
significance because GC pair is bound by three hydrogen bonds instead
of two like AT pairs. So it has higher thermostability which could
result in different significance, like higher annealing temperature in
PCR. So we hope to choose warm color for G,C
and cold color for
A,T
, and a color in between to represent N
. They are
chosen from diverging color set of color brewer
. So we should
be able to easily tell the GC enriched region. This set of color also
passed vischeck for colorblind people.
In GRanges
object, we have strand
which contains three
levels, +, -, *. We are using a qualitative color set from Color
. This color set pass the colorblind test. It should be a safe
color set to use to color strand.
For most default color scheme if they are under 18, we are trying to use package dichromat to set color for color blind people. But for data set that contains more than 18 objects, it's not possible to assign colorblind-safe color to them anymore. So we need to repeat some color. It should not matter too much, because even normal people cannot tell the difference anymore.
Here are the definition for the data sets.
list("
", " ", list(list("DNA_BASES"), list("
", " Contains ", list("A,C,T,G"), ".
", " ")), "
", " ", list(list("DNA_ALPHABET"), list("
", " This alphabet contains all letters from the IUPAC Extended Genetic
", " Alphabet (see "?IUPAC_CODE_MAP") + the gap ("-") and the hard
", " masking ("+") letters.
", " ")), "
", " ", list(list("DNA_BASES_N"), list("
", " Contains ", list("A,C,T,G,N"), "
", " ")), "
", " ", list(list("RNA_BASES_N"),
list("
", " Contains ", list("A,C,U,G,N"), " ", " ")), " ", " ", list(list("RNA_BASES"), list(" ", " Contains ", list("A,C,T,G"), " ", " ")), " ", " ", list(list("RNA_ALPHABET"), list(" ", " This alphabet contains all letters from the IUPAC Extended Genetic ", " Alphabet (see ?IUPAC_CODE_MAP) where "T" is replaced by "U" ", " + the gap ("-") and the hard masking ("+") letters. ", " ")), " ", " ", list(list("IUPAC_CODE_MAP"), list(" ",
" The "IUPAC_CODE_MAP" named character vector contains the mapping
", " from the IUPAC nucleotide ambiguity codes to their meaning.
", " ")), "
", " ", list(list("AMINO_ACID_CODE"), list("
", " Single-Letter Amino Acid Code (see "?AMINO_ACID_CODE").
", " ")), "
", " ", list(list("AA_ALPHABET"), list("
", " This alphabet contains all letters from the Single-Letter Amino
", " Acid Code (see "?AMINO_ACID_CODE") + the stop ("*"), the gap
", " ("-") and the hard masking ("+") letters
",
" ")), "
", " ", list(list("STRAND"), list(" ", " Contains ", list(""+", "-", "*""), " ", " ")), " ", " ", list(list("CYTOBAND"), list(" ", " Contains giemsa stain results:", list("gneg, gpos25, gpos50, gpos75, ", " gpos100, gvar, stalk, acen"), ". Color defined in package geneplotter. ", " ")), " ", " ")
Value
A character of vector contains color(rgb), and the names of the vector
is originally from the name of different data set. e.g. for DNA_BASES,
it's just A,C,T,G
. This allow users to get color for a vector
of specified names. Please see the examples below.
Author
Tengfei Yin
Examples
opts <- getOption("biovizBase")
opts$DNABasesNColor[1] <- "red"
options(biovizBase = opts)
## get from option(default)
getBioColor("DNA_BASES_N")
## get default fixed color
getBioColor("DNA_BASES_N", source = "default")
seqs <- c("A", "C", "T", "G", "G", "G", "C")
## get colors for a sequence.
getBioColor("DNA_BASES_N")[seqs]
getFormalNames()
Get formals from functions
Description
Get formals from functions, used for dispatching arguments inside.
Usage
getFormalNames(..., remove.dots = TRUE)
Arguments
Argument | Description |
---|---|
... | functions. |
remove.dots | logical value, indicate remove dots in formals or not, default is TRUE . |
Value
A character vector for formal arguments.
Author
Tengfei Yin
Examples
getFormalNames(plot, sapply)
getGaps()
get gaps for a stepping transformed GRanges object
Description
Get gaps for a stepping transformed GRanges object, for visualization purpose. So a extra "stepping" column is required. Please see details below for motivation.
Usage
list(list("getGaps"), list("GRanges"))(obj, group.name = NULL, facets = NULL)
Arguments
Argument | Description |
---|---|
obj | A GRanges object. |
group.name | group name, such as transcript ID, this is the group method within each panel of facets and gaps will be computed for each group of intervals. |
facets | formula used for creating graphics, all variables must be present in the data. |
Details
Since faceting is a subset and group process in visualization
stage, some statistical computation need to be taken place after
that. This leaves some computation like computing gaps hard based on
solely GRanges
object. Extra information like facets formula
and group method would help to generate gaps which make sure they are
aligned on the same level and within the same panel for grouped
intervals. facets variables will be added to gaps GRanges
along
with group.name.
Value
A GRanges
object representing gaps and for each row, the
"stepping" column help later visualization and make sure gaps and
intervals they are generated from are showed on the expected place.
Author
Tengfei Yin
Examples
set.seed(1)
N <- 100
library(GenomicRanges)
gr <- GRanges(seqnames =
sample(c("chr1", "chr2", "chr3"),
size = N, replace = TRUE),
IRanges(
start = sample(1:300, size = N, replace = TRUE),
width = sample(70:75, size = N,replace = TRUE)),
strand = sample(c("+", "-", "*"), size = N,
replace = TRUE),
value = rnorm(N, 10, 3), score = rnorm(N, 100, 30),
sample = sample(c("Normal", "Tumor"),
size = N, replace = TRUE),
pair = sample(letters, size = N,
replace = TRUE))
grl <- splitByFacets(gr, sample ~ seqnames)
gr <- unlist(endoapply(grl, addStepping))
gr.gaps <- getGaps(gr, group.name = "stepping", facets = sample ~ seqnames)
getIdeoGR()
Get ideogram information
Description
This function tries to parse ideogram information from
seqlengths
of a GRanges
object. This information is
usually used to plot chromosome background for kaytogram or esitmate
proper lengths of chromosomes from data space for showing overview.
Usage
getIdeoGR(gr)
Arguments
Argument | Description |
---|---|
gr | A GRanges object with or without lengths information. |
Value
A ideogram GRanges
object, each row indicate one single
chromosome, and start with 1 and end with real chromosome length or
estimated lengths.
Author
Tengfei Yin
Examples
library(GenomicRanges)
data("hg19IdeogramCyto", package = "biovizBase")
hg19IdeogramCyto
## without seqlengths, simply reduce
getIdeoGR(hg19IdeogramCyto)
## with seqlengths
gr <- GRanges("chr1", IRanges(1,3))
seqlevels(gr) <- c("chr1", "chr2", "chr3")
nms <- c(100, 200, 300)
names(nms) <- c("chr1", "chr2", "chr3")
seqlengths(gr) <- nms
gr
getIdeoGR(gr)
getIdeogram()
Get ideogram.
Description
Get ideogram w/o cytoband for certain genome
Usage
getIdeogram(genome, subchr, cytobands=TRUE)
Arguments
Argument | Description |
---|---|
genome | Single specie names, which must be one of the result from ucscGenomes()$db . If missing, will invoke a menu for users to choose from. |
subchr | A character vector used to subset the result. |
cytobands | If TRUE, return ideogram with gieStain and name column. If FALSE, simply return the genome information for each chromosome. |
Details
This function require a network connection, it will parse the data on the fly, function is a wrapper of some functionality from rtracklayer package to get certain table like cytoBand, a full table schema could be found http://genome.ucsc.edu/cgi-bin/hgTables in UCSC genome browser.This is useful for visualization of the whole genome or single chromosome, you can see some examples in ggbio package.
Value
A GRanges object.
Author
Tengfei Yin
Examples
hg19IdeogramCyto <- getIdeogram("hg19", cytoband = TRUE)
getScale()
Get scale information from a GRanges
Description
Trying to get scale information from a GRanges
object, used for
circular view for geom "scale".
Usage
getScale(gr, unit = NULL, n = 100, type = c("M", "B", "sci"))
Arguments
Argument | Description |
---|---|
gr | a GRanges object. |
unit | A numeric value for scale unit. Default NULL use argument n to estimate the unit. |
n | Integer value to indicate how many scale ticks to make. |
type | unit types to shown. |
Value
A GRanges
object, with extra column: "type" indicate it's longer
major ticks or shorter minor ticks. "scale.y" indicates y height for
major and minor ticks. Default ratio is 3:1.
Author
Tengfei Yin
hg19Ideogram()
Hg19 ideogram without cytoband information...
Description
Hg19 ideogram without cytoband information
Usage
data(hg19Ideogram)
Details
This data set provides hg19 genome information wihout cytoband information.
Examples
data(hg19Ideogram)
hg19Ideogram
hg19IdeogramCyto()
Hg19 ideogram with cytoband information...
Description
Hg19 ideogram with cytoband information
Usage
data(hg19IdeogramCyto)
Details
This data set provides hg19 genome information with cytoband information.
Examples
data(hg19IdeogramCyto)
hg19IdeogramCyto
ideo()
ideogram without cytoband information
Description
ideogram without cytoband information
Usage
data(ideo)
Details
This data set provides hg19, hg18, mm10, mm9 genome information wihout cytoband information as a lit.
Examples
data(ideo)
ideo
ideoCyto()
ideogram with cytoband information
Description
ideogram with cytoband information
Usage
data(ideoCyto)
Details
This data set provides hg19, hg18, mm10, mm9 genome information with cytoband information as a lit.
Examples
data(ideoCyto)
ideoCyto
isIdeogram()
Ideogram checking
Description
Check if an object is ideogram or not
Usage
isIdeogram(obj)
Arguments
Argument | Description |
---|---|
obj | object |
Details
Simply test if it's the result coming
from getIdeogram
function or not, make sure it's a
GRanges
and with extra column
Value
A logical value to indicate it's a ideogram or not.
Author
Tengfei Yin
Examples
data(hg19IdeogramCyto)
data(hg19Ideogram)
isIdeogram(hg19IdeogramCyto)
isIdeogram(hg19Ideogram)
isMatchedWithModel()
Utils for Splicing Summary
Description
Utilities used for summarizing isoforms
Usage
isJunctionRead(cigar)
isMatchedWithModel(model, gr)
Arguments
Argument | Description |
---|---|
cigar | A CIGAR string vector. |
model | A GRanges object. |
gr | A GRanges object. |
Details
isJunctionRead
simply parsing the CIGAR string to see if there
is "N" in between and return a logical vector of the same length as
cigar parameters, indicate it's junction read or not.
isMatchedWithModel
mapping gr
to model
, and counting overlapped
cases for each row of model, If gr
contains all the read, this
will return a logical vector of the same length as gr
, and
indicate if each read is the support for this model. NOTICE: we only
assume it's a full model, so each model
here is simply one
isoform. So we only treat the gaped reads which only overlapped with two
consecutive exons in model
as one support for it.
Value
Logical vectors.
Author
Tengfei Yin
Examples
library(GenomicAlignments)
bamfile <- system.file("extdata", "SRR027894subRBM17.bam",
package="biovizBase")
## get index of junction read
which(isJunctionRead(cigar(readGAlignments(bamfile))))
##
model <- GRanges("chr1", IRanges(c(10, 20, 30, 40), width = 5))
gr <- GRanges("chr1", IRanges(c(10, 10, 12, 22, 33), c(31, 40, 22, 32,
44)))
isMatchedWithModel(model, gr)
isSimpleIdeogram()
Simple ideogram checking
Description
Check if an object is a simple ideogram or not
Usage
isSimpleIdeogram(obj)
Arguments
Argument | Description |
---|---|
obj | object |
Details
test if it's GRanges or not, doesn't require cytoband information. But it double check to see if there is only one entry per chromosome
Value
A logical value to indicate it's a simple ideogram or not.
Author
tengfei
Examples
data(hg19IdeogramCyto)
data(hg19Ideogram)
isSimpleIdeogram(hg19IdeogramCyto)
isSimpleIdeogram(hg19Ideogram)
labs()
parse x and y label information from a specific object
Description
parse y label information, object specific.
Usage
list(list("getYLab"), list("TxDb"))(obj)
list(list("getXLab"), list("GRanges"))(obj)
list(list("getXLab"), list("GRangesList"))(obj)
list(list("getXLab"), list("GAlignments"))(obj)
Arguments
Argument | Description |
---|---|
obj | A TxDb object. |
Value
a string.
Author
Tengfei Yin
maxGap_method()
Estimated max gaps
Description
Compute an estimated max gap information, which could
be used as max.gap argument in shringkageFun
.
Usage
list(list("maxGap"), list("GenomicRanges"))(obj, ratio = 0.0025)
Arguments
Argument | Description |
---|---|
obj | GenomicRanges object |
ratio | Multiple by the range of the provided gaps as the max gap. |
Details
This function tries to estimate an appropriate max gap to be used for creating a shrinkage function.
Value
A numeric value
Seealso
Author
Tengfei Yin
Examples
require(GenomicRanges)
gr1 <- GRanges("chr1", IRanges(start = c(100, 300, 600),
end = c(200, 400, 800)))
gr2 <- GRanges("chr1", IRanges(start = c(100, 350, 550),
end = c(220, 500, 900)))
gaps.gr <- intersect(gaps(gr1, start = min(start(gr1))),
gaps(gr2, start = min(start(gr2))))
shrink.fun <- shrinkageFun(gaps.gr, max.gap = maxGap(gaps.gr))
shrink.fun(gr1)
shrink.fun(gr2)
mold_method()
mold data into data.frame
Description
mold data into data.frame usued for visualization, so it's sometimes not as simple as coersion, for example, we add midpoint varialbe for mapping.
Usage
list(list("mold"), list("IRanges"))(data)
list(list("mold"), list("GRanges"))(data)
list(list("mold"), list("GRangesList"))(data,indName = "grl_name")
list(list("mold"), list("Seqinfo"))(data)
list(list("mold"), list("matrix"))(data)
list(list("mold"), list("eSet"))(data)
list(list("mold"), list("ExpressionSet"))(data)
list(list("mold"), list("RangedSummarizedExperiment"))(data, assay.id = 1)
list(list("mold"), list("Views"))(data)
list(list("mold"), list("Rle"))(data)
list(list("mold"), list("RleList"))(data)
list(list("mold"), list("VRanges"))(data)
Arguments
Argument | Description |
---|---|
data | data to be mold into data.frame with additional column that helped mapping.For example we add 'midpoint' variable to a ranged converted data.frame . |
indName | when collapsing a GRangesList , list names are aggregated into extra column named by this parameter. |
assay.id | define the assay id you want to convert into a data.frame . |
Value
a data.frame object.
Author
Tengfei Yin
pileupAsGRanges()
Summarize reads for certain region
Description
This function summarize reads from bam files for nucleotides on single base unit in a given region, this allows the downstream mismatch summary analysis.
Usage
pileupAsGRanges(bams, regions, DNABases=c("A", "C", "G", "T", "N"), ...)
Arguments
Argument | Description |
---|---|
bams | A character which specify the bam file path. |
regions | A GRanges object specifying the region to be summarized. This passed to which arguments in ApplyPileupsParam . |
DNABases | Nucleotide type you want to summarize in the result and in specified order. It must be one or more of A,C,G,T,N. |
... | Extra parameters passed to ApplyPileupsParam . |
Details
It's a wrapper around applyPileup
function in Rsamtools
package, more detailed control could be found under manual of
ApplyPileupsParam function in Rsamtools. pileupAsGRanges
function
return a GRanges object which including summary of nucleotides,
depth, bam file path. This object could be read directly into
pileupGRangesAsVariantTable
function for mismatch
summary.
Value
A GRanges object, each row is one single base unit. and
elementMetadata contains summary about this position about all
nucleotides specified by DNABases. and depth
for total
reads, bam
for file path.
Author
Michael Lawrence, Tengfei Yin
Examples
library(Rsamtools)
data(genesymbol)
library(BSgenome.Hsapiens.UCSC.hg19)
bamfile <- system.file("extdata", "SRR027894subRBM17.bam", package="biovizBase")
test <- pileupAsGRanges(bamfile, region = genesymbol["RBM17"])
test.match <- pileupGRangesAsVariantTable(test, Hsapiens)
head(test[,-7])
head(test.match[,-5])
pileupGRangesAsVariantTable()
Mismatch summary
Description
Compare to reference genome and compute mismatch summary for certain region of reads.
Usage
pileupGRangesAsVariantTable(gr, genome, DNABases=c("A", "C", "G", "T", "N"))
Arguments
Argument | Description |
---|---|
gr | A GRanges object, with nucleotides summary, each base take one column in elementMetadata or user can simply passed the returned result from pileupAsGRanges function to this function. |
genome | BSgenome object, need to be the reference genome. |
DNABases | Nucleotide types contained in passed GRanges object. Default is A/C/G/T/N, it tries to match the column names in elementMetadata to those default nucleotides. And treat the matched column as base names. |
Details
User need to make sure to pass the right reference genome to this function to get the right summary. This function drop the position has no reads and only keep the region with coverage in the summary. The result could be used to show stacked barchart for mismatch summary.
Value
A GRanges object. Containing the following elementMetadata
list("ref") list("Nucleotide in reference genome.")
list("read") list("Nucleotide contained in the reads at particular position, ", "if multiple nucleotide, either matched or unmatched are found, they will be ", "summarized in different rows.")
list("count") list("Count for read column.")
list("match") list("Logical value, whether matched to reference genome or not")
list("bam") list("Character indicate bam file path.")
Author
Michael Lawrence, Tengfei Yin
Examples
library(Rsamtools)
data(genesymbol)
library(BSgenome.Hsapiens.UCSC.hg19)
bamfile <- system.file("extdata", "SRR027894subRBM17.bam", package="biovizBase")
test <- pileupAsGRanges(bamfile, region = genesymbol["RBM17"])
test.match <- pileupGRangesAsVariantTable(test, Hsapiens)
head(test[,-7])
head(test.match[,-5])
plotColorLegend()
Show colors
Description
Plot color legend, simple way to check your default color scheme
Usage
plotColorLegend(colors, labels, title)
Arguments
Argument | Description |
---|---|
colors | A character vector of colors |
labels | Labels to put aside colors, if missing , use names of the colors character vector. |
title | Title for the color legend. |
Details
Show color sheme as a legend style, labels
Value
A graphic device showing color legend.
Author
Tengfei Yin
Examples
cols <- getOption("biovizBase")$baseColor
plotColorLegend(cols, title = "strand legend")
scale()
get x scale breaks and labels
Description
get x scale breaks and labels for GRanges with different coordintes(currently only "truncate_gaps" and "genome" supported).
Usage
list(list("getXScale"), list("GRanges"))(obj, type = c("default", "all", "left", "right"))
Arguments
Argument | Description |
---|---|
obj | a GRanges object. "coord" in metadata shows proper coordinates transformation for this object. |
type | types of labels for transformed data. |
Value
list of breaks and labels.
Author
Tengfei Yin
Examples
library(GenomicRanges)
gr1 <- GRanges("chr1", IRanges(start = c(100, 300, 600),
end = c(200, 400, 800)))
shrink.fun1 <- shrinkageFun(gaps(gr1), max.gap = maxGap(gaps(gr1), 0.15))
shrink.fun2 <- shrinkageFun(gaps(gr1), max.gap = 0)
s1 <- shrink.fun1(gr1)
getXScale(s1)
# coord:genome
set.seed(1)
gr1 <- GRanges("chr1", IRanges(start = as.integer(runif(20, 1, 100)),
width = 5))
gr2 <- GRanges("chr2", IRanges(start = as.integer(runif(20, 1, 100)),
width = 5))
gr <- c(gr1, gr2)
gr.t <- transformToGenome(gr, space.skip = 1)
getXScale(gr.t)
showColor()
Show colors
Description
Show colors with color string or names of color vectors.
Usage
showColor(colors, label = c("color", "name"))
Arguments
Argument | Description |
---|---|
colors | A color character vector. |
label | "color" label color with simple color string, and "name" label color with names of the vectors. |
Value
A plot.
Author
Tengfei Yin
Examples
showColor(getBioColor("CYTOBAND"))
shrinkageFun_method()
Shrinkage function
Description
Create a shrinkage function based on specified gaps and shrinkage rate.
Usage
## For IRanges
list(list("shrinkageFun"), list("IRanges"))(obj, max.gap = 1L)
## For GenomicRanges
list(list("shrinkageFun"), list("GenomicRanges"))(obj, max.gap = 1L)
is_coord_truncate_gaps(obj)
Arguments
Argument | Description |
---|---|
obj | GenomicRanges object which represent gaps |
max.gap | Gaps to be kept, it's a fixed value, if this value is bigger than certain gap interval, then that gap is not going to be shrunk. |
Details
shrinkageFun
function will read in a GenomicRanges object
which represent the gaps, and return a function which works for
another GenomicRanges object, to shrink that object based on
previously specified gaps shrinking information. You could use
this function to treat multiple tracks(e.g. GRanges) to make sure
they shrunk based on the common gaps and the same ratio.
is_coord_truncate_gaps
is used to check if a GRanges
object is in "truncate_gaps" coordiantes or not.
Value
A shrinkage function which could shrink a GenomicRanges object, this function will add coord "truncate_gaps" and max.gap to metadata, ".ori" for oringal data as extra column
Author
Michael Lawrence, Tengfei Yin
Examples
library(GenomicRanges)
gr1 <- GRanges("chr1", IRanges(start = c(100, 300, 600),
end = c(200, 400, 800)))
shrink.fun1 <- shrinkageFun(gaps(gr1), max.gap = maxGap(gaps(gr1), 0.1))
shrink.fun2 <- shrinkageFun(gaps(gr1, start(gr1), end(gr1)), max.gap = maxGap(gaps(gr1), 0.1))
shrink.fun3 <- shrinkageFun(gaps(gr1), max.gap = 0)
s1 <- shrink.fun1(gr1)
s2 <- shrink.fun2(gr1)
s3 <- shrink.fun3(gr1)
metadata(s1)$coord
values(s1)$.ori
is_coord_truncate_gaps(s1)
splitByFacets_method()
split a GRanges by formula
Description
Split a GRanges by formula into GRangesList. Parse variables in
formula and form a interaction factors then split the GRanges
by the factos.
Usage
list(list("splitByFacets"), list("GRanges,formula"))(object, facets)
list(list("splitByFacets"), list("GRanges,GRanges"))(object, facets)
list(list("splitByFacets"), list("GRanges,NULL"))(object, facets)
list(list("splitByFacets"), list("GRanges,missing"))(object, facets)
Arguments
Argument | Description |
---|---|
object | GRanges object. |
facets | formula object, such as . ~ seqnames. Or GRanges object, or NULL . |
Details
if facets is formula, factors are crreated based on interaction of
variables in formula, then split it with this factor. If facets is
GRanges
object, it first subset the original data by facets
GRanges
, then split by each region in the facets. If facets is
NULL
, split just by seqnames. This function is used to
perform computation in conjunction with facets argments in higher
level graphic function.
Value
A GRangesList
.
Author
Tengfei Yin
Examples
library(GenomicRanges)
N <- 1000
gr <- GRanges(seqnames =
sample(c("chr1", "chr2", "chr3"),
size = N, replace = TRUE),
IRanges(
start = sample(1:300, size = N, replace = TRUE),
width = sample(70:75, size = N,replace = TRUE)),
strand = sample(c("+", "-", "*"), size = N,
replace = TRUE),
value = rnorm(N, 10, 3), score = rnorm(N, 100, 30),
sample = sample(c("Normal", "Tumor"),
size = N, replace = TRUE),
pair = sample(letters, size = N,
replace = TRUE))
facets <- sample ~ seqnames
splitByFacets(gr, facets)
splitByFacets(gr)
gr.sub <- GRanges("chr1", IRanges(c(1, 200, 250), width = c(50, 10,
30)))
splitByFacets(gr, gr.sub)
strip_formula_dots()
strip dots around a formula variables
Description
strip dots around variables in a formula.
Usage
strip_formula_dots(formula)
Arguments
Argument | Description |
---|---|
formula | A formula. such as coverage ~ seqnames, or ..coverage.. ~ seqnames. |
Value
A formula wihout ".." around for all variables.
Author
Tengfei Yin
Examples
obj <- ..coverage.. ~ seqnames
strip_formula_dots(obj)
subsetArgsByFormals()
Subset list of arguments by functions
Description
find arguments matched by formals of passed functions,
Usage
subsetArgsByFormals(args, ..., remove.dots = TRUE)
Arguments
Argument | Description |
---|---|
args | list of arguments with names indicate the formals. |
... | functions used to parse formals. |
remove.dots | logical value indicate whether to include dots in formals or not. |
Value
argumnets that matched with passed function.
Author
Tengfei Yin
Examples
args <- list(x = 1:3, simplify = TRUE, b = "b")
subsetArgsByFormals(args, plot, sapply)
transform()
Transform GRanges to different coordinates and layout
Description
Used for coordiante genome transformation, other transformation in circular view.
Usage
list(list("transformToGenome"), list("GRanges"))(data, space.skip = 0.1, chr.weight
= NULL)
list(list("transformToGenome"), list("GRangesList"))(data, space.skip = 0.1,
chr.weight = NULL)
list(list("transformToArch"), list("GRanges"))(data, width = 1)
transformToCircle(data, x = NULL, y = NULL, ylim = NULL,
radius = 10, trackWidth =10,
direction = c("clockwise", "anticlockwise"),
mul = 0.05)
transformToRectInCircle(data, y = NULL, space.skip = 0.1, trackWidth = 10, radius = 10,
direction = c("clockwise", "anticlockwise"),
n = 100, mul = 0.05, chr.weight = NULL)
transformToBarInCircle(data, y = NULL, space.skip = 0.1, trackWidth = 10, radius = 10,
direction = c("clockwise", "anticlockwise"),
n = 100, mul = 0.05, chr.weight = NULL)
transformToSegInCircle(data, y = NULL, space.skip = 0.1, trackWidth = 10, radius = 10,
direction = c("clockwise", "anticlockwise"), n =
100, chr.weight = NULL)
transformToLinkInCircle(data, linked.to, space.skip = 0.1, trackWidth = 10, radius = 10,
link.fun = function(x, y, n = 100) bezier(x, y, evaluation = n),
direction = c("clockwise", "anticlockwise"), chr.weight = NULL)
transformDfToGr(data, seqnames = NULL, start = NULL, end = NULL,
width = NULL, strand = NULL,
to.seqnames = NULL, to.start = NULL, to.end = NULL,
to.width = NULL, to.strand = NULL, linked.to
= to.gr)
list(list("transformToDf"), list("GRanges"))(data)
is_coord_genome(data)
Arguments
Argument | Description |
---|---|
data | a GRanges object. for function transformDfToGr it's data.frame. |
x | character for variable as x axis used for transformation. |
y | character for variable as y axis used for transformation. |
ylim | numeric range to control the ylimits on tracks when 'y' information is involved. |
space.skip | numeric values indicates skipped ratio of whole space, not skipped space is identical between each space. |
radius | numeric value, indicates radius when transform to a circle. |
trackWidth | numeric value, for track width. |
direction | "clockwise" or "counterclockwise", for layout or transform direction to circle. |
mul | numeric value, passed to expand_range function, to control margin of y in the track. |
n | integer value, control interpolated points numbers. |
linked.to | a column name of GRanges object, indicate the linked line's end point which represented as a GRanges too.. |
link.fun | function used to generate linking lines. |
seqnames | character or integer values for column name or index indicate variable mapped to seqnames, default NULL use "seqnames". |
start | character or integer values for column name or index indicate variable mapped to start, default NULL use "start". |
end | character or integer values for column name or index indicate variable mapped to end, default NULL use "end". |
width | character or integer values for column name or index indicate variable mapped to width, default NULL use "width". |
strand | character or integer values for column name or index indicate variable mapped to strand, default NULL use "strand". |
to.seqnames | character or integer values for column name or index indicate variable mapped to linked seqnames, default NULL , create GRanges without new GRanges attached as column. If this varialbe is not NULL , this mean you try to parse linked GRanges object. |
to.start | character or integer values for column name or index indicate variable mapped to start of linked GRanges, default NULL use "to.start". |
to.end | character or integer values for column name or index indicate variable mapped to end of linked GRanges, default NULL use "to.end". |
to.width | character or integer values for column name or index indicate variable mapped to width of linked GRanges, default NULL use "to.width". |
to.strand | character or integer values for column name or index indicate variable mapped to strand, default NULL use "to.strand" or just use *. |
chr.weight | numeric vectors which sum to <1, the names of vectors has to be matched with seqnames in seqinfo, and you can only specify part of the seqnames, other lengths of chromosomes will be assined proportionally to their seqlengths, for example, you could specify chr1 to be 0.5, so the chr1 will take half of the space and other chromosomes squeezed to take left of the space. |
Value
A GRanges
object, with calculated new variables, including
".circle.x" for transformed x position, ".circle.y" for transformed y
position, ".circle.angle" for transformed angle.
Author
Tengfei Yin
Examples
library(biovizBase)
library(GenomicRanges)
set.seed(1)
gr1 <- GRanges("chr1", IRanges(start = as.integer(runif(20, 1, 2e9)),
width = 5))
gr2 <- GRanges("chr2", IRanges(start = as.integer(runif(20, 1, 2e9)),
width = 5))
gr <- c(gr1, gr2)
gr.t <- transformToGenome(gr, space.skip = 0.1)
is_coord_genome(gr.t)
transformToCircle(gr.t)
transformToRectInCircle(gr)
transformToSegInCircle(gr)
values(gr1)$to.gr <- gr2
transformToLinkInCircle(gr1, linked.to = "to.gr")
transformGRangesForEvenSpace()
Transform GRanges with New Coordinates
Description
For graphics, like linked plot, e.g. generated by
qplotRangesLinkedToData
function in package ggbio.
we need to generate a new set of coordinates which is used for even
spaced statistics track.
Usage
transformGRangesForEvenSpace(gr)
Arguments
Argument | Description |
---|---|
gr | A GRanges object. |
Details
Most used internally for special graphics, like
qplotRangesLinkedToData
function in package ggbio.
Value
A GRanges object as passed in, with new column x.new which indicate the static track coordinates, in this way, we could map the new coordinates with the old one.
Author
Tengfei Yin
Examples
library(GenomicRanges)
gr <- GRanges("chr1", IRanges(seq(1,100, length.out = 10), width = 5))
library(biovizBase)
transformGRangesForEvenSpace(gr)