bioconductor v3.9.0 BiovizBase

The biovizBase package is designed to provide a set of

Link to this section Summary

Functions

CRC

GC content computation for BSgenome

Adding disjoint levels to a GenomicRanges object

Utils for parsing (un)evaluated arguments

biovizBase is a package which provides utilities and color scheme...

Color blind safe palette generator

Checking if string contains letters or not

crc1.GeRL

Fetching GRanges from various data source

Subset of RNA editing sites in hg19...

Estimation of Coverage

Transform GRangesList to GRanges

Gene symbols with position...

Color scheme getter for biological data

Get formals from functions

get gaps for a stepping transformed GRanges object

Get ideogram information

Get ideogram.

Get scale information from a GRanges

Hg19 ideogram without cytoband information...

Hg19 ideogram with cytoband information...

ideogram without cytoband information

ideogram with cytoband information

Ideogram checking

Utils for Splicing Summary

Simple ideogram checking

parse x and y label information from a specific object

Estimated max gaps

mold data into data.frame

Summarize reads for certain region

Show colors

get x scale breaks and labels

Show colors

Shrinkage function

split a GRanges by formula

strip dots around a formula variables

Subset list of arguments by functions

Transform GRanges to different coordinates and layout

Transform GRanges with New Coordinates

Link to this section Functions

CRC

Description

Genomic sequencing of colorectal adenocarcinomas identifies a recurrent VTI1A-TCF7L2 fusion sample data set

Usage

data(CRC)

Details

The data used in this study is from this a paper http://www.nature.com/ng/journal/v43/n10/full/ng.936.html.

Examples

data(CRC)

GC content computation for BSgenome

Description

Compute GC content in a certain region of a BSgenome object

Usage

GCcontent(obj, ..., view.width, as.prob = TRUE)

Arguments

ArgumentDescription
objBSgenome object
...Arguments passed to getSeq method for BSgenome package.
view.widthPassed to letterFrequencyInSlidingView , the constant (e.g. 35, 48, 1000) size of the "window" to slide along obj . The specified letters are tabulated in each window of length view.width . The rows of the result (see value) correspond to the various windows.
as.probIf TRUE return percentage of GC content, otherwise return counts.

Details

GC content is an interesting variable may be related to various biological questions. So we need a way to compute GC content in a certain region of a reference genome. GCcontent function is a wrapper around getSeq function in BSgenome package and letterFrequency , letterFrequencyInSlidingView in Biostrings package.

if the view.width is specified, the GC content will be computed in the sliding view

Value

Numeric value indicate count or percentage

Author

Tengfei Yin

Examples

library(BSgenome.Hsapiens.UCSC.hg19)
GCcontent(Hsapiens, GRanges("chr1", IRanges(1e6, 1e6 + 1000)))
Link to this function

addStepping_method()

Adding disjoint levels to a GenomicRanges object

Description

Adding disjoint levels to a GenomicRanges object

Usage

list(list("addStepping"), list("GenomicRanges"))(obj, group.name, extend.size = 0,
                       fix = "center",
                       group.selfish = TRUE)

Arguments

ArgumentDescription
objA GenomicRanges object
group.nameColumn name in the elementMetadata which specify the grouping information of all the entries. If provided, this will make sure all intervals belong to the same group will try to be on the same level and nothing falls in between.
extend.sizeAdding invisible buffered region to the GenomicRanges object, if it's 10, then adding 5 at both end. This make the close neighbors assigned to the different levels and make your eyes easy to identify.
fix"start", "end", or "center"(default) passed to resize , denoting what to use as an anchor for each element of obj, and add extend.size to it's current width.
group.selfishPassed to addStepping , control whether to show each group as unique level or not. If set to FALSE , if two groups are not overlapped with each other, they will probably be layout in the same level to save space.

Details

This is a tricky question, for example, pair-end RNA-seq data could be represented as two set of GenomicRanges object, one indicates the read, one indicates the junction. At the same time, we need to make sure pair-ended read are shown on the same level, and nothing falls in between. For better visualization of the data, we may hope to add invisible extended buffer to the reads, so closely neighbored reads will be on the different levels.

Value

A modified GenmicRanges object with stepping as one column.

Author

Tengfei Yin

Examples

library(GenomicRanges)
set.seed(1)
N <- 500
## sample GRanges
gr <- GRanges(seqnames =
sample(c("chr1", "chr2", "chr3", "chrX", "chrY"),
size = N, replace = TRUE),
IRanges(
start = sample(1:300, size = N, replace = TRUE),
width = sample(70:75, size = N,replace = TRUE)),
strand = sample(c("+", "-", "*"), size = N,
replace = TRUE),
value = rnorm(N, 10, 3), score = rnorm(N, 100, 30),
group = sample(c("Normal", "Tumor"),
size = N, replace = TRUE),
pair = sample(letters, size = N,
replace = TRUE))

## grouping and extending
head(addStepping(gr))
head(addStepping(gr, group.name = "pair"))
gr.close <- GRanges(c("chr1", "chr1"), IRanges(c(10, 20), width = 9))
addStepping(gr.close)
addStepping(gr.close, extend.size = 5)

Utils for parsing (un)evaluated arguments

Description

Utilities for parsing (un)evaluated arguments

Usage

parseArgsForAes(args)
parseArgsForNonAes(args)

Arguments

ArgumentDescription
argsarguments list.

Value

For parseArgsForAes return a unevaluated arguments.

For parseArgsNonForAes return a evaluated/quoted arguments.

Author

Tengfei Yin

Examples

args <- alist(a = color, b = "b")
# parseArgsForAes(args)
Link to this function

biovizBase_package()

biovizBase is a package which provides utilities and color scheme...

Description

biovizBase is a package which provides utilities and color scheme for higher level graphic package which aim to visualize biological data especially genetic data.

Details

This package provides default color scheme for nucleotide, strand, amino acid, try to pass colorblind checking as possible as we can. And also provide giemsa stain result color scheme used to show cytoband. This package also provides utilites to manipulate and summarize raw data to get them ready to be visualized.

Link to this function

colorBlindSafePal()

Color blind safe palette generator

Description

This function help users create a function based on specified color blind safe palette. And the returned function could be used for color generation.

Usage

genDichromatPalInfo()
genBrewerBlindPalInfo()
genBlindPalInfo()
colorBlindSafePal(palette)
blind.pal.info
brewer.pal.blind.info
dichromat.pal.blind.info

Arguments

ArgumentDescription
paletteA index numeric value or character. Please see blind.pal.info , the palette could be "pal_id" or names the row in which users could specify the palette you want to use.

Details

We get those color-blind safe palette based on http://colorbrewer2.org/ and http://geography.uoregon.edu/datagraphics/color_scales.htm those color are implemented in two packages, RColorBrewer and dichromat. But RColorBrewer doesn't provide subset of color-blind safe palette info. And dichromat doesn't group color based on "quality", "sequential" and "divergent", so we pick those color manually and create a combined palette, blind.pal.info.

colorBlindSafePal will return a function, this function will take two arguments, n and repeatable . if n is smaller than 3(n >= 3 is required by RColorBrewer), we use 3 instead and return a color vector. If n is over the maxcolors column in blind.pal.info, we use maxcolors instead when repeatable set to FALSE , if repeatable set to TRUE we repeat the color of all the allowed colors(length equals maxcolors ) in the same order. This has special case in certain graphics which is always displayed side by side and don't worry about the repeated colors being neighbors.

genBrewerBlindPalInfo return brewer.pal.blind.info data frame containing all color-blind safe palettes from brewer.pal.info defined in RColorBrewer, but it's not only just subset of it, it also changes some maxcolors info.

genDichromatPalInfo return dichromat.pal.blind.info data frame.

genBlindPalInfo return blind.pal.info data frame.

Value

A color generating function with arguments n and repeatable . n specifying how many different discrete colors you want to get from them palette, and if repeatable turned on and set to TRUE, you can specify n even larger than maximum color. The color will be repeated following the same order.

Author

Tengfei Yin

Examples

library(scales)
## brewer subse of only color blind palette
brewer.pal.blind.info
genBrewerBlindPalInfo()
## dichromat info
dichromat.pal.blind.info
genDichromatPalInfo()
## all color blind palette, adding id/pkg.
blind.pal.info
## with no parameters, just return blind.pal.info
colorBlindSafePal()
mypal <- colorBlindSafePal(20)
## or pass character name
mypal <- colorBlindSafePal("Set2")
mypal12 <- colorBlindSafePal(22)
show_col(mypal(12, repeatable = FALSE)) # warning
show_col(mypal(11, repeatable = TRUE))  # no warning, and repeat
show_col(mypal12(12))
Link to this function

containLetters()

Checking if string contains letters or not

Description

Test if a string contain any letters

Usage

containLetters(obj, all=FALSE)

Arguments

ArgumentDescription
objString
allIf set to FALSE, return TRUE when any letters appears; if all is set to TRUE, return TRUE only when the string is composed of just letters.

Details

Useful when processing/sorting seqnames

Value

Logical value

Author

tengfei

Examples

containLetters("XYZ123")
containLetters("XYZ123", TRUE)

crc1.GeRL

Description

CRC-1 mutation and structural rearrangment

Usage

data(crc1.GeRL)

Details

GenomicRangesList contains somatic mutation, reaarangment etc for a tumor sample, please check the reference for detials.

References

Genomic sequencing of colorectal adenocarcinomas identifies a recurrent VTI1A-TCF7L2 fusion

Examples

data(crc1.GeRL)
crc1.GeRL
Link to this function

crunch_method()

Fetching GRanges from various data source

Description

Fetching Granges from various data source, currently supported objects are TxDb, EnsDb, GAlignments and BamFile.

Usage

list(list("crunch"), list("TxDb"))(obj, which, columns = c("tx_id", "tx_name","gene_id"),
       type = c("all", "reduce"), truncate.gaps = FALSE,
       truncate.fun = NULL, ratio = 0.0025)
list(list("crunch"), list("EnsDb"))(obj, which, columns = c("tx_id", "gene_name","gene_id"),
       type = c("all", "reduce"), truncate.gaps = FALSE,
       truncate.fun = NULL, ratio = 0.0025)
list(list("crunch"), list("GAlignments"))(obj, which, truncate.gaps = FALSE,
       truncate.fun = NULL, ratio = 0.0025)
list(list("crunch"), list("BamFile"))(obj, which, ..., type = c("gapped.pair", "raw", "all"),
       truncate.gaps = FALSE, truncate.fun = NULL, ratio = 0.0025)

Arguments

ArgumentDescription
objsupported objects are TxDb , GAlignments and BamFile .
whichGRanges object. For TxDb object, could aslo be a list. For EnsDb , it can also be a single filter object extending AnnotationFilter-class , an AnnotationFilterList combining several such objects or a filter expression in form of a formula (see AnnotationFilter for examples).
columnscolumns to include in the output.
typedefault 'all' is to show the full model, 'reduce' is to show a single model.
truncate.gapslogical value, default FALSE . Whether to truncate gaps or not.
truncate.funshrinkage function.
rationumeric value, shrinking ratio.
...arguments passed to function readGAlignments .

Value

GRanges object.

Author

Tengfei Yin

Examples

library(biovizBase)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
data(genesymbol, package = "biovizBase")
txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
obj <- txdb
temp <- crunch(txdb, which = genesymbol["BRCA1"], type = "all")
temp <- crunch(txdb, which = genesymbol["BRCA1"], type = "reduce")

## Fetching data from a EnsDb object.
library(EnsDb.Hsapiens.v75)
edb <- EnsDb.Hsapiens.v75
gr <- genesymbol["BRCA1"]
seqlevels(gr) <- sub(seqlevels(gr), pattern="chr", replacement="")
temp <- crunch(edb, which = gr)

## Alternatively, use the GenenameFilter from the AnnotationFilter package:
temp <- crunch(edb, which = GenenameFilter("BRCA1"))

## Or a filter expression
temp <- crunch(edb, which = ~ genename == "BRCA1")
Link to this function

darned_hg19_subset500()

Subset of RNA editing sites in hg19...

Description

Subset of RNA editing sites in hg19

Usage

data(darned_hg19_subset500)

Details

This data set provides a subset(500 sites only) of hg19 RNA editing sites, and originally from DARNED http://darned.ucc.ie/ for the hg19 assembly.

Examples

data(darned_hg19_subset500)
darned_hg19_subset500
Link to this function

estimateCoverage_method()

Estimation of Coverage

Description

Estimation of Coverage

Usage

list(list("estimateCoverage"), list("BamFile"))(x, maxBinSize = 2^14)

Arguments

ArgumentDescription
xA BamFile object.
maxBinSizeMax bin size.

Value

A GRanges object.

Author

Michael Lawrence

Transform GRangesList to GRanges

Description

Transform GRangesList to GRanges.

Usage

flatGrl(object, indName = "grl_name")

Arguments

ArgumentDescription
objecta GRangesList object.
indNamecolumn named by 'indName' that groups the records by their original element in 'object'.

Details

This method is different from default stack , it integrate elementMetadata of GRangesList to the final coerced GRanges .

Value

A GRanges object.

Author

Tengfei Yin

Examples

library(GenomicRanges)
gr1 <- GRanges("chr1", IRanges(1:10, width = 5))
gr2 <- GRanges("chr2", IRanges(1:10, width = 5))
obj <- GRangesList(gr1, gr2)
values(obj) <- data.frame(a = 1:2, b = letters[1:2])
stack(obj)
flatGrl(obj)

Gene symbols with position...

Description

Gene symbols with position

Usage

data(genesymbol)

Details

This data set provides genen symbols in human with position and starnd information, stored as GRanges object.

Examples

data(genesymbol)
head(genesymbol)
genesymbol["RBM17"]

Color scheme getter for biological data

Description

This function tries to get default color scheme either from fixed palette or options for specified data set, usually just biological data.

Usage

getBioColor(type = c("DNA_BASES_N", "DNA_BASES", "DNA_ALPHABET",
                     "RNA_BASES_N", "RNA_BASES", "RNA_ALPHABET",
                     "IUPAC_CODE_MAP", "AMINO_ACID_CODE", "AA_ALPHABET",
                     "STRAND", "CYTOBAND"),
            source = c("option", "default"))

Arguments

ArgumentDescription
typeColor set based on which you want to get.
source"option" tries to get color scheme from Options . This allow user to edit the color globally. "default" gets fixed color scheme.

Details

Most data set specified in the type argument are defined in Biostrings package, such as "DNA_BASES", "DNA_ALPHABET", "RNA_BASES", "RNA_ALPHABET", "IUPAC_CODE_MAP", "AMINO_ACID_CODE", "AA_ALPHABET", please check the manual for more details.

"DNA_BASES_N" is just "DNA_BASES" with extra "N" used in certain cases, like the result from applyPileup in Rsamtools package. We start with the five most used nucleotides, A,T,C,G,N , In genetics, GC-content usually has special biological significance because GC pair is bound by three hydrogen bonds instead of two like AT pairs. So it has higher thermostability which could result in different significance, like higher annealing temperature in PCR. So we hope to choose warm color for G,C and cold color for A,T , and a color in between to represent N . They are chosen from diverging color set of color brewer . So we should be able to easily tell the GC enriched region. This set of color also passed vischeck for colorblind people.

In GRanges object, we have strand which contains three levels, +, -, *. We are using a qualitative color set from Color . This color set pass the colorblind test. It should be a safe color set to use to color strand.

For most default color scheme if they are under 18, we are trying to use package dichromat to set color for color blind people. But for data set that contains more than 18 objects, it's not possible to assign colorblind-safe color to them anymore. So we need to repeat some color. It should not matter too much, because even normal people cannot tell the difference anymore.

Here are the definition for the data sets.

list(" ", " ", list(list("DNA_BASES"), list(" ", " Contains ", list("A,C,T,G"), ". ", " ")), " ", " ", list(list("DNA_ALPHABET"), list(" ", " This alphabet contains all letters from the IUPAC Extended Genetic ", " Alphabet (see "?IUPAC_CODE_MAP") + the gap ("-") and the hard ", " masking ("+") letters.
", " ")), " ", " ", list(list("DNA_BASES_N"), list(" ", " Contains ", list("A,C,T,G,N"), " ", " ")), " ", " ", list(list("RNA_BASES_N"),

list("

", " Contains ", list("A,C,U,G,N"), " ", " ")), " ", " ", list(list("RNA_BASES"), list(" ", " Contains ", list("A,C,T,G"), " ", " ")), " ", " ", list(list("RNA_ALPHABET"), list(" ", " This alphabet contains all letters from the IUPAC Extended Genetic ", " Alphabet (see ?IUPAC_CODE_MAP) where "T" is replaced by "U" ", " + the gap ("-") and the hard masking ("+") letters. ", " ")), " ", " ", list(list("IUPAC_CODE_MAP"), list(" ",

"      The "IUPAC_CODE_MAP" named character vector contains the mapping

", " from the IUPAC nucleotide ambiguity codes to their meaning. ", " ")), " ", " ", list(list("AMINO_ACID_CODE"), list(" ", " Single-Letter Amino Acid Code (see "?AMINO_ACID_CODE"). ", " ")), " ", " ", list(list("AA_ALPHABET"), list(" ", " This alphabet contains all letters from the Single-Letter Amino ", " Acid Code (see "?AMINO_ACID_CODE") + the stop ("*"), the gap ", " ("-") and the hard masking ("+") letters
",

"    ")), "

", " ", list(list("STRAND"), list(" ", " Contains ", list(""+", "-", "*""), " ", " ")), " ", " ", list(list("CYTOBAND"), list(" ", " Contains giemsa stain results:", list("gneg, gpos25, gpos50, gpos75, ", " gpos100, gvar, stalk, acen"), ". Color defined in package geneplotter. ", " ")), " ", " ")

Value

A character of vector contains color(rgb), and the names of the vector is originally from the name of different data set. e.g. for DNA_BASES, it's just A,C,T,G . This allow users to get color for a vector of specified names. Please see the examples below.

Author

Tengfei Yin

Examples

opts <- getOption("biovizBase")
opts$DNABasesNColor[1] <- "red"
options(biovizBase = opts)
## get from option(default)
getBioColor("DNA_BASES_N")
## get default fixed color
getBioColor("DNA_BASES_N", source = "default")
seqs <- c("A", "C", "T", "G", "G", "G", "C")
## get colors for a sequence.
getBioColor("DNA_BASES_N")[seqs]
Link to this function

getFormalNames()

Get formals from functions

Description

Get formals from functions, used for dispatching arguments inside.

Usage

getFormalNames(..., remove.dots = TRUE)

Arguments

ArgumentDescription
...functions.
remove.dotslogical value, indicate remove dots in formals or not, default is TRUE .

Value

A character vector for formal arguments.

Author

Tengfei Yin

Examples

getFormalNames(plot, sapply)

get gaps for a stepping transformed GRanges object

Description

Get gaps for a stepping transformed GRanges object, for visualization purpose. So a extra "stepping" column is required. Please see details below for motivation.

Usage

list(list("getGaps"), list("GRanges"))(obj, group.name = NULL, facets = NULL)

Arguments

ArgumentDescription
objA GRanges object.
group.namegroup name, such as transcript ID, this is the group method within each panel of facets and gaps will be computed for each group of intervals.
facetsformula used for creating graphics, all variables must be present in the data.

Details

Since faceting is a subset and group process in visualization stage, some statistical computation need to be taken place after that. This leaves some computation like computing gaps hard based on solely GRanges object. Extra information like facets formula and group method would help to generate gaps which make sure they are aligned on the same level and within the same panel for grouped intervals. facets variables will be added to gaps GRanges along with group.name.

Value

A GRanges object representing gaps and for each row, the "stepping" column help later visualization and make sure gaps and intervals they are generated from are showed on the expected place.

Author

Tengfei Yin

Examples

set.seed(1)
N <- 100
library(GenomicRanges)
gr <- GRanges(seqnames =
sample(c("chr1", "chr2", "chr3"),
size = N, replace = TRUE),
IRanges(
start = sample(1:300, size = N, replace = TRUE),
width = sample(70:75, size = N,replace = TRUE)),
strand = sample(c("+", "-", "*"), size = N,
replace = TRUE),
value = rnorm(N, 10, 3), score = rnorm(N, 100, 30),
sample = sample(c("Normal", "Tumor"),
size = N, replace = TRUE),
pair = sample(letters, size = N,
replace = TRUE))

grl <- splitByFacets(gr, sample ~ seqnames)
gr <- unlist(endoapply(grl, addStepping))
gr.gaps <- getGaps(gr, group.name = "stepping", facets = sample ~ seqnames)

Get ideogram information

Description

This function tries to parse ideogram information from seqlengths of a GRanges object. This information is usually used to plot chromosome background for kaytogram or esitmate proper lengths of chromosomes from data space for showing overview.

Usage

getIdeoGR(gr)

Arguments

ArgumentDescription
grA GRanges object with or without lengths information.

Value

A ideogram GRanges object, each row indicate one single chromosome, and start with 1 and end with real chromosome length or estimated lengths.

Author

Tengfei Yin

Examples

library(GenomicRanges)
data("hg19IdeogramCyto", package = "biovizBase")
hg19IdeogramCyto
## without seqlengths, simply reduce
getIdeoGR(hg19IdeogramCyto)
## with seqlengths
gr <- GRanges("chr1",  IRanges(1,3))
seqlevels(gr) <- c("chr1", "chr2", "chr3")
nms <- c(100, 200, 300)
names(nms) <- c("chr1", "chr2", "chr3")
seqlengths(gr) <- nms
gr
getIdeoGR(gr)

Get ideogram.

Description

Get ideogram w/o cytoband for certain genome

Usage

getIdeogram(genome, subchr, cytobands=TRUE)

Arguments

ArgumentDescription
genomeSingle specie names, which must be one of the result from ucscGenomes()$db . If missing, will invoke a menu for users to choose from.
subchrA character vector used to subset the result.
cytobandsIf TRUE, return ideogram with gieStain and name column. If FALSE, simply return the genome information for each chromosome.

Details

This function require a network connection, it will parse the data on the fly, function is a wrapper of some functionality from rtracklayer package to get certain table like cytoBand, a full table schema could be found http://genome.ucsc.edu/cgi-bin/hgTables in UCSC genome browser.This is useful for visualization of the whole genome or single chromosome, you can see some examples in ggbio package.

Value

A GRanges object.

Author

Tengfei Yin

Examples

hg19IdeogramCyto <- getIdeogram("hg19", cytoband = TRUE)

Get scale information from a GRanges

Description

Trying to get scale information from a GRanges object, used for circular view for geom "scale".

Usage

getScale(gr, unit = NULL, n = 100, type = c("M", "B", "sci"))

Arguments

ArgumentDescription
gra GRanges object.
unitA numeric value for scale unit. Default NULL use argument n to estimate the unit.
nInteger value to indicate how many scale ticks to make.
typeunit types to shown.

Value

A GRanges object, with extra column: "type" indicate it's longer major ticks or shorter minor ticks. "scale.y" indicates y height for major and minor ticks. Default ratio is 3:1.

Author

Tengfei Yin

Hg19 ideogram without cytoband information...

Description

Hg19 ideogram without cytoband information

Usage

data(hg19Ideogram)

Details

This data set provides hg19 genome information wihout cytoband information.

Examples

data(hg19Ideogram)
hg19Ideogram
Link to this function

hg19IdeogramCyto()

Hg19 ideogram with cytoband information...

Description

Hg19 ideogram with cytoband information

Usage

data(hg19IdeogramCyto)

Details

This data set provides hg19 genome information with cytoband information.

Examples

data(hg19IdeogramCyto)
hg19IdeogramCyto

ideogram without cytoband information

Description

ideogram without cytoband information

Usage

data(ideo)

Details

This data set provides hg19, hg18, mm10, mm9 genome information wihout cytoband information as a lit.

Examples

data(ideo)
ideo

ideogram with cytoband information

Description

ideogram with cytoband information

Usage

data(ideoCyto)

Details

This data set provides hg19, hg18, mm10, mm9 genome information with cytoband information as a lit.

Examples

data(ideoCyto)
ideoCyto

Ideogram checking

Description

Check if an object is ideogram or not

Usage

isIdeogram(obj)

Arguments

ArgumentDescription
objobject

Details

Simply test if it's the result coming from getIdeogram function or not, make sure it's a GRanges and with extra column

Value

A logical value to indicate it's a ideogram or not.

Author

Tengfei Yin

Examples

data(hg19IdeogramCyto)
data(hg19Ideogram)
isIdeogram(hg19IdeogramCyto)
isIdeogram(hg19Ideogram)
Link to this function

isMatchedWithModel()

Utils for Splicing Summary

Description

Utilities used for summarizing isoforms

Usage

isJunctionRead(cigar)
isMatchedWithModel(model, gr)

Arguments

ArgumentDescription
cigarA CIGAR string vector.
modelA GRanges object.
grA GRanges object.

Details

isJunctionRead simply parsing the CIGAR string to see if there is "N" in between and return a logical vector of the same length as cigar parameters, indicate it's junction read or not.

isMatchedWithModel mapping gr to model , and counting overlapped cases for each row of model, If gr contains all the read, this will return a logical vector of the same length as gr , and indicate if each read is the support for this model. NOTICE: we only assume it's a full model, so each model here is simply one isoform. So we only treat the gaped reads which only overlapped with two consecutive exons in model as one support for it.

Value

Logical vectors.

Author

Tengfei Yin

Examples

library(GenomicAlignments)
bamfile <- system.file("extdata", "SRR027894subRBM17.bam",
package="biovizBase")
## get index of junction read
which(isJunctionRead(cigar(readGAlignments(bamfile))))
##
model <- GRanges("chr1", IRanges(c(10, 20, 30, 40), width  = 5))
gr <- GRanges("chr1", IRanges(c(10, 10, 12, 22, 33), c(31, 40, 22, 32,
44)))
isMatchedWithModel(model, gr)
Link to this function

isSimpleIdeogram()

Simple ideogram checking

Description

Check if an object is a simple ideogram or not

Usage

isSimpleIdeogram(obj)

Arguments

ArgumentDescription
objobject

Details

test if it's GRanges or not, doesn't require cytoband information. But it double check to see if there is only one entry per chromosome

Value

A logical value to indicate it's a simple ideogram or not.

Author

tengfei

Examples

data(hg19IdeogramCyto)
data(hg19Ideogram)
isSimpleIdeogram(hg19IdeogramCyto)
isSimpleIdeogram(hg19Ideogram)

parse x and y label information from a specific object

Description

parse y label information, object specific.

Usage

list(list("getYLab"), list("TxDb"))(obj)
list(list("getXLab"), list("GRanges"))(obj)
list(list("getXLab"), list("GRangesList"))(obj)
list(list("getXLab"), list("GAlignments"))(obj)

Arguments

ArgumentDescription
objA TxDb object.

Value

a string.

Author

Tengfei Yin

Link to this function

maxGap_method()

Estimated max gaps

Description

Compute an estimated max gap information, which could be used as max.gap argument in shringkageFun .

Usage

list(list("maxGap"), list("GenomicRanges"))(obj, ratio = 0.0025)

Arguments

ArgumentDescription
objGenomicRanges object
ratioMultiple by the range of the provided gaps as the max gap.

Details

This function tries to estimate an appropriate max gap to be used for creating a shrinkage function.

Value

A numeric value

Seealso

shrinkageFun

Author

Tengfei Yin

Examples

require(GenomicRanges)
gr1 <- GRanges("chr1", IRanges(start = c(100, 300, 600),
end = c(200, 400, 800)))
gr2 <- GRanges("chr1", IRanges(start = c(100, 350, 550),
end = c(220, 500, 900)))
gaps.gr <- intersect(gaps(gr1, start = min(start(gr1))),
gaps(gr2, start = min(start(gr2))))
shrink.fun <- shrinkageFun(gaps.gr, max.gap = maxGap(gaps.gr))
shrink.fun(gr1)
shrink.fun(gr2)

mold data into data.frame

Description

mold data into data.frame usued for visualization, so it's sometimes not as simple as coersion, for example, we add midpoint varialbe for mapping.

Usage

list(list("mold"), list("IRanges"))(data)
list(list("mold"), list("GRanges"))(data)
list(list("mold"), list("GRangesList"))(data,indName = "grl_name")
list(list("mold"), list("Seqinfo"))(data)
list(list("mold"), list("matrix"))(data)
list(list("mold"), list("eSet"))(data)
list(list("mold"), list("ExpressionSet"))(data)
list(list("mold"), list("RangedSummarizedExperiment"))(data, assay.id = 1)
list(list("mold"), list("Views"))(data)
list(list("mold"), list("Rle"))(data)
list(list("mold"), list("RleList"))(data)
list(list("mold"), list("VRanges"))(data)

Arguments

ArgumentDescription
datadata to be mold into data.frame with additional column that helped mapping.For example we add 'midpoint' variable to a ranged converted data.frame .
indNamewhen collapsing a GRangesList , list names are aggregated into extra column named by this parameter.
assay.iddefine the assay id you want to convert into a data.frame .

Value

a data.frame object.

Author

Tengfei Yin

Link to this function

pileupAsGRanges()

Summarize reads for certain region

Description

This function summarize reads from bam files for nucleotides on single base unit in a given region, this allows the downstream mismatch summary analysis.

Usage

pileupAsGRanges(bams, regions, DNABases=c("A", "C", "G", "T", "N"), ...)

Arguments

ArgumentDescription
bamsA character which specify the bam file path.
regionsA GRanges object specifying the region to be summarized. This passed to which arguments in ApplyPileupsParam .
DNABasesNucleotide type you want to summarize in the result and in specified order. It must be one or more of A,C,G,T,N.
...Extra parameters passed to ApplyPileupsParam .

Details

It's a wrapper around applyPileup function in Rsamtools package, more detailed control could be found under manual of ApplyPileupsParam function in Rsamtools. pileupAsGRanges function return a GRanges object which including summary of nucleotides, depth, bam file path. This object could be read directly into pileupGRangesAsVariantTable function for mismatch summary.

Value

A GRanges object, each row is one single base unit. and elementMetadata contains summary about this position about all nucleotides specified by DNABases. and depth for total reads, bam for file path.

Author

Michael Lawrence, Tengfei Yin

Examples

library(Rsamtools)
data(genesymbol)
library(BSgenome.Hsapiens.UCSC.hg19)
bamfile <- system.file("extdata", "SRR027894subRBM17.bam", package="biovizBase")
test <- pileupAsGRanges(bamfile, region = genesymbol["RBM17"])
test.match <- pileupGRangesAsVariantTable(test, Hsapiens)
head(test[,-7])
head(test.match[,-5])
Link to this function

pileupGRangesAsVariantTable()

Mismatch summary

Description

Compare to reference genome and compute mismatch summary for certain region of reads.

Usage

pileupGRangesAsVariantTable(gr, genome, DNABases=c("A", "C", "G", "T", "N"))

Arguments

ArgumentDescription
grA GRanges object, with nucleotides summary, each base take one column in elementMetadata or user can simply passed the returned result from pileupAsGRanges function to this function.
genomeBSgenome object, need to be the reference genome.
DNABasesNucleotide types contained in passed GRanges object. Default is A/C/G/T/N, it tries to match the column names in elementMetadata to those default nucleotides. And treat the matched column as base names.

Details

User need to make sure to pass the right reference genome to this function to get the right summary. This function drop the position has no reads and only keep the region with coverage in the summary. The result could be used to show stacked barchart for mismatch summary.

Value

A GRanges object. Containing the following elementMetadata

  • list("ref") list("Nucleotide in reference genome.")

  • list("read") list("Nucleotide contained in the reads at particular position, ", "if multiple nucleotide, either matched or unmatched are found, they will be ", "summarized in different rows.")

  • list("count") list("Count for read column.")

  • list("match") list("Logical value, whether matched to reference genome or not")

  • list("bam") list("Character indicate bam file path.")

Author

Michael Lawrence, Tengfei Yin

Examples

library(Rsamtools)
data(genesymbol)
library(BSgenome.Hsapiens.UCSC.hg19)
bamfile <- system.file("extdata", "SRR027894subRBM17.bam", package="biovizBase")
test <- pileupAsGRanges(bamfile, region = genesymbol["RBM17"])
test.match <- pileupGRangesAsVariantTable(test, Hsapiens)
head(test[,-7])
head(test.match[,-5])
Link to this function

plotColorLegend()

Show colors

Description

Plot color legend, simple way to check your default color scheme

Usage

plotColorLegend(colors, labels, title)

Arguments

ArgumentDescription
colorsA character vector of colors
labelsLabels to put aside colors, if missing , use names of the colors character vector.
titleTitle for the color legend.

Details

Show color sheme as a legend style, labels

Value

A graphic device showing color legend.

Author

Tengfei Yin

Examples

cols <- getOption("biovizBase")$baseColor
plotColorLegend(cols, title = "strand legend")

get x scale breaks and labels

Description

get x scale breaks and labels for GRanges with different coordintes(currently only "truncate_gaps" and "genome" supported).

Usage

list(list("getXScale"), list("GRanges"))(obj, type = c("default", "all", "left", "right"))

Arguments

ArgumentDescription
obja GRanges object. "coord" in metadata shows proper coordinates transformation for this object.
typetypes of labels for transformed data.

Value

list of breaks and labels.

Author

Tengfei Yin

Examples

library(GenomicRanges)
gr1 <- GRanges("chr1", IRanges(start = c(100, 300, 600),
end = c(200, 400, 800)))
shrink.fun1 <- shrinkageFun(gaps(gr1), max.gap = maxGap(gaps(gr1), 0.15))
shrink.fun2 <- shrinkageFun(gaps(gr1), max.gap = 0)
s1 <- shrink.fun1(gr1)
getXScale(s1)
#  coord:genome
set.seed(1)
gr1 <- GRanges("chr1", IRanges(start = as.integer(runif(20, 1, 100)),
width = 5))
gr2 <- GRanges("chr2", IRanges(start = as.integer(runif(20, 1, 100)),
width = 5))
gr <- c(gr1, gr2)
gr.t <- transformToGenome(gr, space.skip = 1)
getXScale(gr.t)

Show colors

Description

Show colors with color string or names of color vectors.

Usage

showColor(colors, label = c("color", "name"))

Arguments

ArgumentDescription
colorsA color character vector.
label"color" label color with simple color string, and "name" label color with names of the vectors.

Value

A plot.

Author

Tengfei Yin

Examples

showColor(getBioColor("CYTOBAND"))
Link to this function

shrinkageFun_method()

Shrinkage function

Description

Create a shrinkage function based on specified gaps and shrinkage rate.

Usage

## For IRanges
list(list("shrinkageFun"), list("IRanges"))(obj, max.gap = 1L)
## For GenomicRanges
list(list("shrinkageFun"), list("GenomicRanges"))(obj, max.gap = 1L)
is_coord_truncate_gaps(obj)

Arguments

ArgumentDescription
objGenomicRanges object which represent gaps
max.gapGaps to be kept, it's a fixed value, if this value is bigger than certain gap interval, then that gap is not going to be shrunk.

Details

shrinkageFun function will read in a GenomicRanges object which represent the gaps, and return a function which works for another GenomicRanges object, to shrink that object based on previously specified gaps shrinking information. You could use this function to treat multiple tracks(e.g. GRanges) to make sure they shrunk based on the common gaps and the same ratio.

is_coord_truncate_gaps is used to check if a GRanges object is in "truncate_gaps" coordiantes or not.

Value

A shrinkage function which could shrink a GenomicRanges object, this function will add coord "truncate_gaps" and max.gap to metadata, ".ori" for oringal data as extra column

Author

Michael Lawrence, Tengfei Yin

Examples

library(GenomicRanges)
gr1 <- GRanges("chr1", IRanges(start = c(100, 300, 600),
end = c(200, 400, 800)))

shrink.fun1 <- shrinkageFun(gaps(gr1), max.gap = maxGap(gaps(gr1), 0.1))
shrink.fun2 <- shrinkageFun(gaps(gr1, start(gr1), end(gr1)), max.gap = maxGap(gaps(gr1), 0.1))
shrink.fun3 <- shrinkageFun(gaps(gr1), max.gap = 0)
s1 <- shrink.fun1(gr1)
s2 <- shrink.fun2(gr1)
s3 <- shrink.fun3(gr1)
metadata(s1)$coord
values(s1)$.ori
is_coord_truncate_gaps(s1)
Link to this function

splitByFacets_method()

split a GRanges by formula

Description

Split a GRanges by formula into GRangesList. Parse variables in formula and form a interaction factors then split the GRanges by the factos.

Usage

list(list("splitByFacets"), list("GRanges,formula"))(object, facets)
list(list("splitByFacets"), list("GRanges,GRanges"))(object, facets)
list(list("splitByFacets"), list("GRanges,NULL"))(object, facets)
list(list("splitByFacets"), list("GRanges,missing"))(object, facets)

Arguments

ArgumentDescription
objectGRanges object.
facetsformula object, such as . ~ seqnames. Or GRanges object, or NULL .

Details

if facets is formula, factors are crreated based on interaction of variables in formula, then split it with this factor. If facets is GRanges object, it first subset the original data by facets GRanges , then split by each region in the facets. If facets is NULL , split just by seqnames. This function is used to perform computation in conjunction with facets argments in higher level graphic function.

Value

A GRangesList .

Author

Tengfei Yin

Examples

library(GenomicRanges)
N <- 1000
gr <- GRanges(seqnames =
sample(c("chr1", "chr2", "chr3"),
size = N, replace = TRUE),
IRanges(
start = sample(1:300, size = N, replace = TRUE),
width = sample(70:75, size = N,replace = TRUE)),
strand = sample(c("+", "-", "*"), size = N,
replace = TRUE),
value = rnorm(N, 10, 3), score = rnorm(N, 100, 30),
sample = sample(c("Normal", "Tumor"),
size = N, replace = TRUE),
pair = sample(letters, size = N,
replace = TRUE))
facets <- sample ~ seqnames
splitByFacets(gr, facets)
splitByFacets(gr)
gr.sub <- GRanges("chr1", IRanges(c(1, 200, 250), width = c(50, 10,
30)))
splitByFacets(gr, gr.sub)
Link to this function

strip_formula_dots()

strip dots around a formula variables

Description

strip dots around variables in a formula.

Usage

strip_formula_dots(formula)

Arguments

ArgumentDescription
formulaA formula. such as coverage ~ seqnames, or ..coverage.. ~ seqnames.

Value

A formula wihout ".." around for all variables.

Author

Tengfei Yin

Examples

obj <- ..coverage.. ~ seqnames
strip_formula_dots(obj)
Link to this function

subsetArgsByFormals()

Subset list of arguments by functions

Description

find arguments matched by formals of passed functions,

Usage

subsetArgsByFormals(args, ..., remove.dots = TRUE)

Arguments

ArgumentDescription
argslist of arguments with names indicate the formals.
...functions used to parse formals.
remove.dotslogical value indicate whether to include dots in formals or not.

Value

argumnets that matched with passed function.

Author

Tengfei Yin

Examples

args <- list(x = 1:3, simplify = TRUE, b = "b")
subsetArgsByFormals(args, plot, sapply)

Transform GRanges to different coordinates and layout

Description

Used for coordiante genome transformation, other transformation in circular view.

Usage

list(list("transformToGenome"), list("GRanges"))(data, space.skip = 0.1, chr.weight
= NULL)
list(list("transformToGenome"), list("GRangesList"))(data, space.skip = 0.1,
chr.weight = NULL)
list(list("transformToArch"), list("GRanges"))(data, width = 1)
transformToCircle(data, x = NULL, y = NULL, ylim = NULL, 
                  radius = 10, trackWidth =10,
                  direction = c("clockwise", "anticlockwise"),
                   mul = 0.05)
transformToRectInCircle(data, y = NULL, space.skip = 0.1, trackWidth = 10, radius = 10,
                      direction = c("clockwise", "anticlockwise"),
                      n = 100, mul = 0.05, chr.weight = NULL)
transformToBarInCircle(data, y = NULL, space.skip = 0.1, trackWidth = 10, radius = 10,
                     direction = c("clockwise", "anticlockwise"),
                      n = 100, mul = 0.05, chr.weight = NULL)
transformToSegInCircle(data, y = NULL, space.skip = 0.1, trackWidth = 10, radius = 10,
                      direction = c("clockwise", "anticlockwise"), n =
                      100, chr.weight = NULL)
transformToLinkInCircle(data, linked.to, space.skip = 0.1, trackWidth = 10, radius = 10,
                      link.fun = function(x, y, n = 100) bezier(x, y, evaluation = n),
                      direction = c("clockwise", "anticlockwise"), chr.weight = NULL)
transformDfToGr(data, seqnames = NULL, start = NULL, end = NULL,
                            width = NULL, strand = NULL,
                            to.seqnames = NULL, to.start = NULL, to.end = NULL,
                            to.width = NULL, to.strand = NULL, linked.to
                            = to.gr)
list(list("transformToDf"), list("GRanges"))(data)
is_coord_genome(data)

Arguments

ArgumentDescription
dataa GRanges object. for function transformDfToGr it's data.frame.
xcharacter for variable as x axis used for transformation.
ycharacter for variable as y axis used for transformation.
ylimnumeric range to control the ylimits on tracks when 'y' information is involved.
space.skipnumeric values indicates skipped ratio of whole space, not skipped space is identical between each space.
radiusnumeric value, indicates radius when transform to a circle.
trackWidthnumeric value, for track width.
direction"clockwise" or "counterclockwise", for layout or transform direction to circle.
mulnumeric value, passed to expand_range function, to control margin of y in the track.
ninteger value, control interpolated points numbers.
linked.toa column name of GRanges object, indicate the linked line's end point which represented as a GRanges too..
link.funfunction used to generate linking lines.
seqnamescharacter or integer values for column name or index indicate variable mapped to seqnames, default NULL use "seqnames".
startcharacter or integer values for column name or index indicate variable mapped to start, default NULL use "start".
endcharacter or integer values for column name or index indicate variable mapped to end, default NULL use "end".
widthcharacter or integer values for column name or index indicate variable mapped to width, default NULL use "width".
strandcharacter or integer values for column name or index indicate variable mapped to strand, default NULL use "strand".
to.seqnamescharacter or integer values for column name or index indicate variable mapped to linked seqnames, default NULL , create GRanges without new GRanges attached as column. If this varialbe is not NULL , this mean you try to parse linked GRanges object.
to.startcharacter or integer values for column name or index indicate variable mapped to start of linked GRanges, default NULL use "to.start".
to.endcharacter or integer values for column name or index indicate variable mapped to end of linked GRanges, default NULL use "to.end".
to.widthcharacter or integer values for column name or index indicate variable mapped to width of linked GRanges, default NULL use "to.width".
to.strandcharacter or integer values for column name or index indicate variable mapped to strand, default NULL use "to.strand" or just use *.
chr.weightnumeric vectors which sum to <1, the names of vectors has to be matched with seqnames in seqinfo, and you can only specify part of the seqnames, other lengths of chromosomes will be assined proportionally to their seqlengths, for example, you could specify chr1 to be 0.5, so the chr1 will take half of the space and other chromosomes squeezed to take left of the space.

Value

A GRanges object, with calculated new variables, including ".circle.x" for transformed x position, ".circle.y" for transformed y position, ".circle.angle" for transformed angle.

Author

Tengfei Yin

Examples

library(biovizBase)
library(GenomicRanges)
set.seed(1)
gr1 <- GRanges("chr1", IRanges(start = as.integer(runif(20, 1, 2e9)),
width = 5))
gr2 <- GRanges("chr2", IRanges(start = as.integer(runif(20, 1, 2e9)),
width = 5))
gr <- c(gr1, gr2)
gr.t <- transformToGenome(gr, space.skip = 0.1)
is_coord_genome(gr.t)
transformToCircle(gr.t)
transformToRectInCircle(gr)
transformToSegInCircle(gr)
values(gr1)$to.gr <- gr2
transformToLinkInCircle(gr1, linked.to = "to.gr")
Link to this function

transformGRangesForEvenSpace()

Transform GRanges with New Coordinates

Description

For graphics, like linked plot, e.g. generated by qplotRangesLinkedToData function in package ggbio. we need to generate a new set of coordinates which is used for even spaced statistics track.

Usage

transformGRangesForEvenSpace(gr)

Arguments

ArgumentDescription
grA GRanges object.

Details

Most used internally for special graphics, like qplotRangesLinkedToData function in package ggbio.

Value

A GRanges object as passed in, with new column x.new which indicate the static track coordinates, in this way, we could map the new coordinates with the old one.

Author

Tengfei Yin

Examples

library(GenomicRanges)
gr <- GRanges("chr1", IRanges(seq(1,100, length.out = 10), width = 5))
library(biovizBase)
transformGRangesForEvenSpace(gr)