bioconductor v3.9.0 GSEABase
This package provides classes and methods to support Gene Set Enrichment Analysis (GSEA).
Summary
Functions
Class "CollectionType"
Collection Type Class Constructors
Gene set enrichment data structures and methods
Class "GeneColorSet"
Methods to Construct "GeneColorSet" Instances
Class "GeneIdentifierType"
Gene Identifier Class Constructors
Class "GeneSetCollection"
Methods to construct GeneSetCollection instances
Class "GeneSet"
Methods to construct GeneSet instances
Class "OBOCollection"
Methods for Displaying Detailed GeneSet Information
Read OBO-specified Gene Ontology Collections
Read and write gene sets from Broad or GMT formats
Methods for Function goSlim in Package GSEABase
Methods for Constructing Incidence Matrices Between GeneSets
Methods for Function mapIdentifiers in Package GSEABase
Functions
CollectionType_class()
Class "CollectionType"
Description
These classes provide a way to tag the origin of a GeneSet. Collection types can be used in manipulating (e.g., selecting) sets, and can contain information specific to particular sets (e.g., category and subcategory classifications of BroadCollection).
Seealso
CollectionType constructors; getBroadSets for importing collections from the Broad (and other sources).
Author
Martin Morgan Martin.Morgan@RoswellPark.org
Examples
names(getClass("CollectionType")@subclasses)
## Create a CollectionType and ask for its type
collectionType(ExpressionSetCollection())
## Read two GeneSets from a Broad XML file into a list, verify that
## they are both BroadCollection's. Category / subcategory information
## is unique to Broad collections.
fl <- system.file("extdata", "Broad.xml", package="GSEABase")
sets <- getBroadSets(fl)
sapply(sets, collectionType)
## ExpressionSets are tagged with ExpressionSetCollection; there is no
## 'category' information.
data(sample.ExpressionSet)
gs <- GeneSet(sample.ExpressionSet[100:109],
setName="sample.GeneSet", setIdentifier="123")
collectionType(gs)
## GOCollections are created by reference to GO terms and evidenceCodes
GOCollection("GO:0005488")
## requires library(GO); EntrezIdentifiers automatically created
GeneSet(GOCollection(c("GO:0005488", "GO:0019825"),
evidenceCode="IDA"))
CollectionType_constructors()
Collection Type Class Constructors
Description
These functions construct collection types. Collection types can be used in manipulating (e.g., selecting) sets, and can contain information specific to particular sets (e.g., 'category' and 'subcategory' classifications of 'BroadCollection'.)
Usage
NullCollection(...)
ComputedCollection(...)
ExpressionSetCollection(...)
ChrCollection(ids,...)
ChrlocCollection(ids,...)
KEGGCollection(ids,...)
MapCollection(ids,...)
OMIMCollection(ids,...)
PMIDCollection(ids,...)
PfamCollection(ids, ...)
PrositeCollection(ids, ...)
GOCollection(ids=character(0), evidenceCode="ANY", ontology="ANY", ..., err=FALSE)
OBOCollection(ids, evidenceCode="ANY", ontology="ANY", ...)
BroadCollection(category, subCategory=NA, ...)
Arguments
Argument | Description |
---|---|
category | (Required) Broad category, one of "c1" (positional), "c2" (curated), "c3" (motif), "c4" (computational), "c5" (GO), "c6" (Oncogenic Pathway Activation Modules), "c7" (Immunologic Signatures), "h" (Hallmark). |
subCategory | (Optional) Sub-category; no controlled vocabulary. |
ids | (Optional) Character vector of identifiers (e.g., GO, KEGG, or PMID terms). |
evidenceCode | (Optional) Character vector of GO evidence codes to be included, or "ANY" (any identifier; the default). Evidence is a property of particular genes, rather than of the ontology, so evidenceCode is a convenient way of specifying how users of a GOCollection might restrict derived objects (as is done during creation of a gene set from an expression set). |
ontology | (Optional) Character vector of GO ontology terms to be included, or "ANY" (any identifier; the default). Unlike evidence code, ontology membership is enforced when GOCollection gene sets are constructed. |
err | (Optional) logical scalar indicating whether non-existent GO terms signal an error ( TRUE ), or are silently ignored ( FALSE ). |
... | Additional arguments, usually none but see specific CollectionType classes for possibilities. |
Value
An object of the same class as the function name, initialized as appropriate for the collection.
Seealso
CollectionType
Author
Martin Morgan Martin.Morgan@RoswellPark.org
Examples
NullCollection()
## NullCollection when no collection type specified
collectionType(GeneSet())
collectionType(GeneSet(collectionType=GOCollection()))
## fl could be a url
fl <- system.file("extdata", "Broad.xml", package="GSEABase")
gs1 <- getBroadSets(fl)[[1]]
collectionType(gs1) # BroadCollection
## new BroadCollection, with different category
bc <- BroadCollection(category="c2")
## change collectionType of gs2
gs2 <- gs1
collectionType(gs2) <- NullCollection()
## OBOCollection
fl <- system.file("extdata", "goslim_plant.obo", package="GSEABase")
getOBOCollection(fl, evidenceCode="TAS") # returns OBOCollection
OBOCollection(c("GO:0008967", "GO:0015119", "GO:0030372", "GO:0002732",
"GO:0048090"))
GSEABase_package()
Gene set enrichment data structures and methods
Description
This package provides classes and methods to support Gene Set Enrichment Analysis (GSEA). The GeneSet class provides a common data structure for representing gene sets. The GeneColorSet class allows genes in a set to be associated with phenotypes. The GeneSetCollection class facilitates grouping together a list of related gene sets. The GeneIdentifierType class hierarchy reflects how genes are represented (e.g., Entrez versus symbol) in the gene set. mapIdentifiers provides a way to convert identifiers in a set from one type to another. The CollectionType class hierarchy reflects how the gene set was made, and can order genes into distinct sets or collections.
Seealso
GeneSet, GeneColorSet, GeneSetCollection
Author
Written by Martin Morgan, Seth Falcon, Robert Gentleman. Maintainer: Biocore Team c/o BioC user list bioconductor@stat.math.ethz.ch
Examples
example(GeneSet)
GeneColorSet_class()
Class "GeneColorSet"
Description
A GeneColorSet extends GeneSet to allow genes to be 'colored'. Coloring means that for a particular phenotype, each gene has a color (e.g., expression levels "up", "down", or "unchanged") and a phenotypic consequence (e.g., the phenotype is "enhanced" or "reduced").
All operations on a GeneSet can be applied to a GeneColorSet; coloring can also be accessed.
Seealso
GeneSet
Author
Martin Morgan Martin.Morgan@RoswellPark.org
Examples
## Create a GeneColorSet from an ExpressionSet
data(sample.ExpressionSet)
gcs1 <- GeneColorSet(sample.ExpressionSet[100:109],
phenotype="imaginary")
gcs1
## or with color...
gcs2 <- GeneColorSet(sample.ExpressionSet[100:109],
phenotype="imaginary",
geneColor=factor(
rep(c("up", "down", "unchanged"),
length.out=10)),
phenotypeColor=factor(
rep(c("enhanced", "reduced"),
length.out=10)))
coloring(gcs2)
## recode geneColor of genes 1 and 4
coloring(gcs2)[c(1,4),"geneColor"] <- "down"
coloring(gcs2)
## reset, this time by gene name
coloring(gcs2)[c("31339_at", "31342_at"),"geneColor"] <- c("up", "up")
## usual 'factor' errors and warning apply:
coloring(gcs2)[c("31339_at", "31342_at"),"geneColor"] <- c("UP", "up")
gcs2[["31342_at"]]
try(gcs2[["31342_"]]) # no partial matching
gcs2$"31342" # 1 partial match ok
GeneColorSet_methods()
Methods to Construct "GeneColorSet" Instances
Description
GeneColorSet is a generic for constructing gene color sets (i.e., gene sets with "coloring" to indicate how features of genes and phenotypes are associated).
GeneIdentifierType_class()
Class "GeneIdentifierType"
Description
This class provides a way to tag the meaning of gene symbols in a GeneSet. For instance, a GeneSet with gene names derived from a Bioconductor annotation package (e.g., via ExpressionSet) initially has a GeneIdentifierType of AnnotationIdentifier.
Seealso
The example below lists GeneIdentifierType classes defined in this package; see the help pages of these classes for specific information.
Author
Martin Morgan Martin.Morgan@RoswellPark.org
Examples
names(getClass("GeneIdentifierType")@subclasses)
# create an AnnotationIdentifier, and ask its type
geneIdType(AnnotationIdentifier(annotation="hgu95av2"))
# Construct a GeneSet from an ExpressionSet, using the 'annotation'
# field of ExpressionSet to recognize the genes as AnnotationIdentifier
data(sample.ExpressionSet)
gs <- GeneSet(sample.ExpressionSet[100:109],
setName="sample.GeneSet", setIdentifier="123")
geneIdType(gs) # AnnotationIdentifier
## Read a Broad set from the system (or a url), and discover their
## GeneIdentifierType
fl <- system.file("extdata", "Broad.xml", package="GSEABase")
bsets <- getBroadSets(fl)
sapply(bsets, geneIdType)
## try to combine gene sets with different set types
try(gs & bsets[[1]])
## Use the annotation package associated with the original
## ExpressionSet to map to EntrezIdentifier() ...
geneIdType(gs) <- EntrezIdentifier()
## ...and try again
gs & bsets[[1]]
## Another way to change annotation to Entrez (or other) ids
probeIds <- featureNames(sample.ExpressionSet)[100:109]
geneIds <- getEG(probeIds, "hgu95av2")
GeneSet(EntrezIdentifier(),
setName="sample.GeneSet2", setIdentifier="101",
geneIds=geneIds)
## Create a new identifier
setClass("FooIdentifier",
contains="GeneIdentifierType",
prototype=prototype(
type=new("ScalarCharacter", "Foo")))
## Create a constructor (optional)
FooIdentifier <- function() new("FooIdentifier")
geneIdType(FooIdentifier())
## tidy up
removeClass("FooIdentifier")
GeneIdentifierType_constructors()
Gene Identifier Class Constructors
Description
Gene identifier classes and functions are used to indicate what the
list of genes in a gene set represents (e.g., Entrez gene identifiers
are tagged with EntrezIdentifier()
, Bioconductor annotations with
AnnotationIdentifier()
).
Usage
NullIdentifier(annotation, ...)
EnzymeIdentifier(annotation, ...)
ENSEMBLIdentifier(annotation, ...)
GenenameIdentifier(annotation,...)
RefseqIdentifier(annotation,...)
SymbolIdentifier(annotation,...)
UnigeneIdentifier(annotation,...)
UniprotIdentifier(annotation,...)
EntrezIdentifier(annotation,...)
AnnotationIdentifier(annotation, ...)
AnnoOrEntrezIdentifier(annotation, ...)
Arguments
Argument | Description |
---|---|
annotation | An optional character string identifying the Bioconductor package from which the annotations are drawn, e.g., hgu95av2 , org.Hs.eg.db . Or an src_organism object, e.g. Organism.dplyr::src_organism(TxDb.Hsapiens.UCSC.hg38.knownGene) . |
... | Additional arguments, usually none. |
Value
For all but AnnoOrEntrezIdentifier, an object of the same class as the function name, initialized as appropriate for the identifier.
For AnnoOrEntrezIdentifier, either an AnnotationIdentifier or EntrezIdentifier depending on the argument. This requires that the corresponding chip- or organism package be loaded, hence installed on the user's system.
Seealso
GeneIdentifierType-class for a description of the classes and methods using these objects.
Author
Martin Morgan Martin.Morgan@RoswellPark.org
Examples
NullIdentifier()
data(sample.ExpressionSet)
gs1 <- GeneSet(sample.ExpressionSet[100:109],
setName="sample1", setIdentifier="100")
geneIdType(gs1) # AnnotationIdentifier
geneIds <- featureNames(sample.ExpressionSet)[100:109]
gs2 <- GeneSet(geneIds=geneIds,
setName="sample1", setIdentifier="101")
geneIdType(gs2) # NullIdentifier, since no info about genes provided
## Convert...
ai <- AnnotationIdentifier(annotation(sample.ExpressionSet))
geneIdType(gs2) <- ai
geneIdType(gs2)
## ...or provide more explicit construction
gs3 <- GeneSet(geneIds=geneIds, type=ai,
setName="sample1", setIdentifier="102")
uprotIds <- c("Q9Y6Q1", "A6NJZ7", "Q9BXI6", "Q15035", "A1X283",
"P55957")
gs4 <- GeneSet(uprotIds, geneIdType=UniprotIdentifier())
geneIdType(gs4) # UniprotIdentifier
geneIds(mapIdentifiers(gs4, UnigeneIdentifier(annotation="org.Hs.eg")))
GeneSetCollection_class()
Class "GeneSetCollection"
Description
A GeneSetCollection is a collection of related GeneSets. The collection can mix and match different types of gene sets. Members of the collection are referred to by the setName of each gene set.
Seealso
GeneSet, GeneColorSet
Author
Martin Morgan Martin.Morgan@RoswellPark.org
Examples
gs1 <- GeneSet(setName="set1", setIdentifier="101")
gs2 <- GeneSet(setName="set2", setIdentifier="102")
## construct from individual elements...
gsc <- GeneSetCollection(gs1, gs2)
## or from a list
gsc <- GeneSetCollection(list(gs1, gs2))
## 'names' are the setNames
names(gsc)
## a collection of a single gene set
gsc["set1"]
## a gene set
gsc[["set1"]]
## set names must be unique
try(GeneSetCollection(gs1, gs1))
try(gsc[c("set1", "set1")])
GeneSetCollection_methods()
Methods to construct GeneSetCollection instances
Description
Use GeneSetCollection to construct a collection of gene sets from GeneSet arguments, or a list of GeneSets.
Usage
GeneSetCollection(object, ..., idType, setType)
Arguments
Argument | Description |
---|---|
object | An argument determining how the gene set collection will be created, as described in the methods section. |
... | Additional arguments for gene set collection construction, as described below. |
idType | An argument of class GeneIdentifierType , used to indicate how the geneIds will be represented. |
setType | An argument of class CollectionType , used to indicate how the collection is created. |
Seealso
GeneSetCollection-class
Examples
gs1 <- GeneSet(setName="set1", setIdentifier="101")
gs2 <- GeneSet(setName="set2", setIdentifier="102")
## construct from individual elements...
gsc <- GeneSetCollection(gs1, gs2)
## or from a list
gsc <- GeneSetCollection(list(gs1, gs2))
## set names must be unique
try(GeneSetCollection(gs1, gs1))
data(sample.ExpressionSet)
gsc <- GeneSetCollection(sample.ExpressionSet[200:250],
setType = GOCollection())
## from KEGG identifiers, for example
library(KEGG.db)
lst <- head(as.list(KEGGEXTID2PATHID))
gsc <- GeneSetCollection(mapply(function(geneIds, keggId) {
GeneSet(geneIds, geneIdType=EntrezIdentifier(),
collectionType=KEGGCollection(keggId),
setName=keggId)
}, lst, names(lst)))
GeneSet_class()
Class "GeneSet"
Description
A GeneSet contains a set of gene identifiers. Each gene set has a geneIdType, indicating how the gene identifiers should be interpreted (e.g., as Entrez identifiers), and a collectionType, indicating the origin of the gene set (perhaps including additional information about the set, as in the BroadCollection type).
Conversion between identifiers, subsetting, and logical (set) operations can be performed. Relationships between genes and phenotype in a GeneSet can be summarized using coloring to create a GeneColorSet. A GeneSet can be exported to XML with toBroadXML.
Seealso
GeneColorSet, CollectionType, GeneIdentifierType
Author
Martin Morgan Martin.Morgan@RoswellPark.org
Examples
## Empty gene set
GeneSet()
## Gene set from ExpressionSet
data(sample.ExpressionSet)
gs1 <- GeneSet(sample.ExpressionSet[100:109])
## GeneSet from Broad XML; 'fl' could be a url
fl <- system.file("extdata", "Broad.xml", package="GSEABase")
gs2 <- getBroadSets(fl)[[1]] # actually, a list of two gene sets
## GeneSet from list of geneIds
geneIds <- geneIds(gs2) # any character vector would do
gs3 <- GeneSet(geneIds=geneIds)
## unspecified set type, so...
is(geneIdType(gs3), "NullIdentifier") == TRUE
## update set type to match encoding of identifiers
geneIdType(gs2)
geneIdType(gs3) <- SymbolIdentifier()
## Convert between set types; this consults the 'annotation'
## information encoded in the 'AnnotationIdentifier' set type and the
## corresponding annotation package.
gs4 <- gs1
geneIdType(gs4) <- EntrezIdentifier()
## logical (set) operations
gs5 <- GeneSet(sample.ExpressionSet[100:109], setName="subset1")
gs6 <- GeneSet(sample.ExpressionSet[105:114], setName="subset2")
## intersection: 5 'genes'; note the set name '(subset1 & subset2)'
gs5 & gs6
## union: 15 'genes'; note the set name
gs5 | gs6
## an identity
gs7 <- gs5 | gs6
gs8 <- setdiff(gs5, gs6) | (gs5 & gs6) | setdiff(gs6, gs5)
identical(geneIds(gs7), geneIds(gs8))
identical(gs7, gs8) == FALSE # gs7 and gs8 setNames differ
## output
tmp <- tempfile()
toBroadXML(gs2, tmp)
noquote(readLines(tmp))
## must be BroadCollection() collectionType
try(toBroadXML(gs1))
gs9 <- gs1
collectionType(gs9) <- BroadCollection()
toBroadXML(gs9, tmp)
unlink(tmp)
toBroadXML(gs9) # no connection --> character vector
## list of geneIds --> vector of Broad GENESET XML
gs10 <- getBroadSets(fl) # two sets
entries <- sapply(gs10, function(x) toBroadXML(x))
## list mapIdentifiers available for GeneSet
showMethods("mapIdentifiers", classes="GeneSet", inherit=FALSE)
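The set identity used in the examples above (a union equals the two set differences plus the intersection) can be checked on plain character vectors. This is only a base R sketch of the underlying set logic; the identifiers `g1` to `g15` are made up, and it does not use or reproduce the GeneSet methods themselves.

```r
## Two overlapping sets of hypothetical identifiers, mirroring the
## 100:109 and 105:114 index ranges used in the GeneSet example.
a <- paste0("g", 1:10)
b <- paste0("g", 6:15)

both   <- intersect(a, b)   # analogous to gs5 & gs6
either <- union(a, b)       # analogous to gs5 | gs6

## The identity: union == (a only) + (shared) + (b only)
pieces <- c(setdiff(a, b), both, setdiff(b, a))

length(both)
length(either)
setequal(either, pieces)
```

With GeneSet objects the same identity holds for the gene identifiers, while the generated set names differ, which is why the example compares `geneIds()` rather than the whole objects.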
GeneSet_methods()
Methods to construct GeneSet instances
Description
Use GeneSet to construct gene sets from ExpressionSet, character vector, or other objects.
Usage
GeneSet(type, ..., setIdentifier=.uniqueIdentifier())
Arguments
Argument | Description |
---|---|
type | An argument determining how the gene set will be created, as described in the Methods section. |
setIdentifier | A ScalarCharacter or length-1 character vector uniquely identifying the set. |
... | Additional arguments for gene set construction. Methods have required arguments, as outlined below; additional arguments correspond to slot names of GeneSet. |
Seealso
GeneSet-class
GeneColorSet-class
Examples
## Empty gene set
GeneSet()
## Gene set from ExpressionSet
data(sample.ExpressionSet)
gs1 <- GeneSet(sample.ExpressionSet[100:109])
## GeneSet from Broad XML; 'fl' could be a url
fl <- system.file("extdata", "Broad.xml", package="GSEABase")
gs2 <- getBroadSets(fl)[[1]] # actually, a list of two gene sets
## GeneSet from list of gene identifiers
geneIds <- geneIds(gs2) # any character vector would do
gs3 <- GeneSet(geneIds)
## unspecified set type, so...
is(geneIdType(gs3), "NullIdentifier") == TRUE
## update set type to match encoding of identifiers
geneIdType(gs2)
geneIdType(gs3) <- SymbolIdentifier()
## other ways of accomplishing the same
gs4 <- GeneSet(geneIds, geneIdType=SymbolIdentifier())
gs5 <- GeneSet(SymbolIdentifier(), geneIds=geneIds)
OBOCollection_class()
Class "OBOCollection"
Description
OBOCollection extends the GOCollection class, and is usually constructed from a file formatted following the OBO file format. See CollectionType for general use of collections with gene sets.
Seealso
OBOCollection constructor; CollectionType classes.
Author
Martin Morgan Martin.Morgan@RoswellPark.org
References
http://www.geneontology.org for details of the OBO format.
Examples
fl <- system.file("extdata", "goslim_plant.obo", package="GSEABase")
obo <- getOBOCollection(fl)
obo
subsets(obo)
obo["goslim_plant", evidenceCode="TAS"]
g <- as(obo["goslim_goa"], "graphNEL")
if (interactive() && require("Rgraphviz")) {
plot(g)
}
details_methods()
Methods for Displaying Detailed GeneSet Information
Description
This generic and methods supplement show
, providing more detail
on object contents.
getOBOCollection()
Read OBO-specified Gene Ontology Collections
Description
getOBOCollection parses a uri (file or internet location) encoded following the OBO specification defined by the Gene Ontology consortium.
Usage
getOBOCollection(uri, evidenceCode="ANY", ...)
Arguments
Argument | Description |
---|---|
uri | A file name or URL containing gene sets encoded following the OBO specification. |
evidenceCode | A character vector of evidence codes. |
... | Further arguments passed to the OBOCollection constructor. |
Value
getOBOCollection returns an OBOCollection of gene sets. The gene set is constructed by parsing the file for id tags in TERM stanzas. The parser does not currently support all features of OBO, e.g., the ability to import additional files.
Seealso
OBOCollection
Author
Martin Morgan mtmrogan@fhcrc.org
Examples
## 'fl' could also be a URI
fl <- system.file("extdata", "goslim_plant.obo", package="GSEABase")
getOBOCollection(fl) # returns an OBOCollection
## Download from the internet
fl <- "http://www.geneontology.org/GO_slims/goslim_plant.obo"
getOBOCollection(fl, evidenceCode="TAS")
getObjects()
Read and write gene sets from Broad or GMT formats
Description
getBroadSets parses one or more XML files for gene sets. The file can reside locally or at a URL. The format followed is that defined by the Broad (below). toBroadXML creates Broad XML from BroadCollection gene sets.
toGmt converts GeneSetCollection objects to a character vector representing the gene set collection in GMT format. getGmt reads a GMT file or other character vector into a GeneSetCollection.
Usage
getBroadSets(uri, ..., membersId=c("MEMBERS_SYMBOLIZED", "MEMBERS_EZID"))
toBroadXML(geneSet, con, ...)
asBroadUri(name,
base="http://www.broad.mit.edu/gsea/msigdb/cards")
getGmt(con, geneIdType=NullIdentifier(),
collectionType=NullCollection(), sep="\t", ...)
toGmt(x, con, ...)
Arguments
Argument | Description |
---|---|
uri | A file name or URL containing gene sets encoded following the Broad specification. For Broad sets, the uri can point to a MSIGDB. |
geneSet | A GeneSet with collectionType BroadCollection (to ensure that required information is available). |
x | A GeneSetCollection or other object for which a toGmt method is defined. |
con | A file name or connection to receive output (optional in the case of toXxx ). |
name | A character vector of Broad gene set names, e.g., c('chr16q', 'GNF2_TNFSF10') . |
base | Base uri for finding Broad gene sets. |
geneIdType | A constructor for the type of identifier the members of the gene sets represent. See GeneIdentifierType for more information. |
collectionType | A constructor for the type of collection for the gene sets. See CollectionType for more information. |
sep | The character string separating members of each gene set in the GMT file. |
... | Further arguments passed to the underlying XML parser, particularly file used to specify an output connection for toBroadXML . |
membersId | XML field name from which geneIds are derived. Choose one value; default MEMBERS_SYMBOLIZED . |
Value
getBroadSets returns a GeneSetCollection of gene sets.
toBroadXML returns a character vector representing a single GeneSet or, if con is provided, writes the XML to a file.
asBroadUri can be used to create URI names (to be used by getBroadSets) of Broad files.
getGmt returns a GeneSetCollection of gene sets.
toGmt returns character vectors where each line represents a gene set. If con is provided, the result is written to the specified connection.
Seealso
GeneSetCollection, GeneSet
Note
Actual Broad XML files differ from the DTD (e.g., an implied ',' separator between genes in a set); we parse to and from the files as they actually exist.
Author
Martin Morgan mtmrogan@fhcrc.org
References
http://www.broad.mit.edu/gsea/
Examples
## 'fl' could also be a URI
fl <- system.file("extdata", "Broad.xml", package="GSEABase")
gss <- getBroadSets(fl) # GeneSetCollection of 2 sets
names(gss)
gss[[1]]
## Download 'msigdb_v2.5.xml' or 'c3.all.v2.5.symbols.gmt' from the
## Broad, http://www.broad.mit.edu/gsea/downloads.jsp#msigdb, then
gsc <- getBroadSets("/path/to/msigdb_v2.5.xml")
types <- sapply(gsc, function(elt) bcCategory(collectionType(elt)))
c3gsc1 <- gsc[types == "c3"]
c3gsc2 <- getGmt("/path/to/c3.all.v2.5.symbols.gmt",
collectionType=BroadCollection(category="c3"),
geneIdType=SymbolIdentifier())
fl <- tempfile()
toBroadXML(gss[[1]], con=fl)
noquote(readLines(fl))
unlink(fl)
toBroadXML(gss[[1]]) # character vector
fl <- tempfile()
toGmt(gss, fl)
getGmt(fl)
unlink(fl)
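For orientation, a GMT file holds one gene set per line: the set name, a description field, then the members, all separated by tabs. The round trip below is a base R sketch of that layout only. The set names, gene symbols and the "NA" description placeholder are invented for illustration, and the code is not how toGmt or getGmt are implemented.

```r
## Hypothetical gene sets as a named list of character vectors
sets <- list(set1 = c("TP53", "BRCA1"),
             set2 = c("EGFR", "MYC", "KRAS"))

## One tab-separated line per set: name, description, members
gmtLines <- vapply(names(sets), function(nm) {
    paste(c(nm, "NA", sets[[nm]]), collapse = "\t")
}, character(1))

fl <- tempfile(fileext = ".gmt")
writeLines(gmtLines, fl)

## Read back: field 1 is the name, field 2 the description,
## the remaining fields are the members
fields <- strsplit(readLines(fl), "\t", fixed = TRUE)
parsed <- setNames(lapply(fields, function(f) f[-(1:2)]),
                   vapply(fields, `[`, character(1), 1))
unlink(fl)

parsed
```

In practice getGmt does this parsing and additionally attaches the geneIdType and collectionType supplied by the caller.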
goSlim_methods()
Methods for Function goSlim in Package GSEABase
Description
These methods summarize the gene ontology terms implied by the idSrc argument into the GO terms implied by the slimCollection argument. The summary takes identifiers in idSrc and determines all GO terms that apply to the identifiers. This full list of GO terms is then classified for membership in each term in the slimCollection.
The resulting object is a data frame containing the terms of slimCollection as row labels, counts and frequencies of identifiers classified to each term, and an abbreviated term description.
An identifier in idSrc can expand to several GO terms, and the GO terms in slimCollection can imply an overlapping hierarchy of terms. Thus the resulting summary can easily contain more counts than there are identifiers in idSrc.
Usage
goSlim(idSrc, slimCollection, ontology, ..., verbose=FALSE)
Arguments
Argument | Description |
---|---|
idSrc | An argument determining the source of GO terms to be mapped to slim terms. The source might be a GOCollection of terms, or another object (e.g., ExpressionSet) for which the method can extract GO terms. |
slimCollection | An argument containing the GO slim terms. |
ontology | A character string naming the ontology to be consulted when identifying slim term hierarchies. One of MF (molecular function), BP (biological process), CC (cellular component). |
... | Additional arguments passed to specific methods. |
verbose | Logical influencing whether messages (primarily missing GO terms arising during creation of the slim hierarchy) are reported. |
Examples
myIds <- c("GO:0016564", "GO:0003677", "GO:0004345", "GO:0008265",
    "GO:0003841", "GO:0030151", "GO:0006355", "GO:0009664",
    "GO:0006412", "GO:0015979", "GO:0006457", "GO:0005618",
    "GO:0005622", "GO:0005840", "GO:0015935", "GO:0000311")
myCollection <- GOCollection(myIds)
fl <- system.file("extdata", "goslim_plant.obo", package="GSEABase")
slim <- getOBOCollection(fl)
goSlim(myCollection, slim, "MF")
data(sample.ExpressionSet)
goSlim(sample.ExpressionSet, slim, "MF", evidenceCode="TAS")
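The remark that a summary can contain more counts than there are identifiers follows from each identifier being tallied once for every slim term it falls under. The base R sketch below illustrates only that counting. The identifier-to-slim-term mapping, the `Count` and `Percent` column names and the percentage definition are assumptions for illustration, not the goSlim implementation or its exact output.

```r
## Hypothetical mapping: each identifier falls under one or more slim terms
id2slim <- list(geneA = c("slim1", "slim2"),
                geneB = "slim1",
                geneC = c("slim2", "slim3"))

## Tally every (identifier, slim term) pair
counts <- table(unlist(id2slim, use.names = FALSE))

slimSummary <- data.frame(
    Count   = as.integer(counts),
    Percent = 100 * as.integer(counts) / sum(counts),
    row.names = names(counts))

slimSummary
length(id2slim)           # number of identifiers
sum(slimSummary$Count)    # total count, larger than the number of identifiers
```

Here three identifiers yield five counts because geneA and geneC each belong to two slim terms.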
incidence_methods()
Methods for Constructing Incidence Matrices Between GeneSets
Description
An incidence matrix summarizes shared membership of gene identifiers across (pairs of) gene sets.
Examples
fl <- system.file("extdata", "Broad.xml", package="GSEABase")
gss <- getBroadSets(fl) # GeneSetCollection of 2 sets
## From one or more GeneSetCollections...
imat <- incidence(gss)
dim(imat)
imat[,c(1:3,ncol(imat)-3+1:3)]
## .. or GeneSets
imat1 <- incidence(gss[[1]], gss[[2]], gss[[1]])
imat1[,1:5]
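To make the idea of an incidence matrix concrete, the base R sketch below builds one by hand from plain character vectors, with one row per set and one column per identifier (the layout suggested by the column indexing in the examples above, though that is an assumption here). The set names and identifiers are made up, and this is not the `incidence` method itself.

```r
## Hypothetical gene sets
sets <- list(set1 = c("A", "B", "C"),
             set2 = c("B", "C", "D"))

## All identifiers seen in any set
ids <- unique(unlist(sets, use.names = FALSE))

## 1 where the identifier (column) belongs to the set (row), else 0
imat <- t(vapply(sets, function(s) as.integer(ids %in% s),
                 integer(length(ids))))
colnames(imat) <- ids
imat

## Shared membership between pairs of sets: off-diagonal entries count
## identifiers in common, the diagonal gives set sizes
shared <- imat %*% t(imat)
shared
```

The product of the matrix with its transpose is a convenient way to move from per-identifier membership to pairwise overlap between sets.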
mapIdentifiers_methods()
Methods for Function mapIdentifiers in Package GSEABase
Description
These methods convert the gene identifiers of a gene set from one type to another, e.g., from EntrezIdentifier to AnnotationIdentifier. Methods can be called directly by the user; geneIdType<- provides similar functionality.
verbose=TRUE produces warning messages when maps between identifier types are not 1:1, or a map has to be constructed on the fly (this situation does not apply when using the DBI-based annotation packages).
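A map that is not 1:1 changes the size of the set, which is the situation the verbose warnings are about. The base R sketch below shows the effect with a made-up lookup table in which two probes share one target identifier and one probe has no target. The probe and identifier values are hypothetical, and the handling shown (drop unmapped, keep unique) is an assumption for illustration rather than the documented behaviour of mapIdentifiers.

```r
## Hypothetical probe -> Entrez lookup: p2 and p3 collapse to the same
## identifier, p4 has no mapping
probe2entrez <- c(p1 = "100", p2 = "200", p3 = "200", p4 = NA)

probes <- c("p1", "p2", "p3", "p4")
mapped <- unname(probe2entrez[probes])

## Drop unmapped probes and collapse duplicates
entrez <- unique(mapped[!is.na(mapped)])

length(probes)   # identifiers before mapping
length(entrez)   # identifiers after mapping
entrez
```

Comparing the set size before and after a conversion is a quick way to notice that identifiers were merged or lost.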