bioconductor v3.9.0 GSEABase

This package provides classes and methods to support Gene Set Enrichment Analysis (GSEA).

Summary

Functions

Class "CollectionType"

Collection Type Class Constructors

Gene set enrichment data structures and methods

Class "GeneColorSet"

Methods to Construct "GeneColorSet" Instances

Class "GeneIdentifierType"

Gene Identifier Class Constructors

Class "GeneSetCollection"

Methods to construct GeneSetCollection instances

Class "GeneSet"

Methods to construct GeneSet instances

Class "OBOCollection"

Methods for Displaying Detailed GeneSet Information

Read OBO-specified Gene Ontology Collections

Read and write gene sets from Broad or GMT formats

Methods for Function goSlim in Package GSEABase

Methods for Constructing Incidence Matrices Between GeneSets

Methods for Function mapIdentifiers in Package GSEABase

Functions

CollectionType_class()

Class "CollectionType"

Description

These classes provide a way to tag the origin of a GeneSet. Collection types can be used in manipulating (e.g., selecting) sets, and can contain information specific to particular sets (e.g., category and subcategory classifications of BroadCollection).

See also

CollectionType constructors; getBroadSets for importing collections from the Broad (and other sources).

Author

Martin Morgan Martin.Morgan@RoswellPark.org

Examples

names(getClass("CollectionType")@subclasses)

## Create a CollectionType and ask for its type
collectionType(ExpressionSetCollection())

## Read two GeneSets from a Broad XML file into a list, verify that
## they are both BroadCollection's. Category / subcategory information
## is unique to Broad collections.
fl <- system.file("extdata", "Broad.xml", package="GSEABase")
sets <- getBroadSets(fl)
sapply(sets, collectionType)

## ExpressionSets are tagged with ExpressionSetCollection; there is no
## 'category' information.
data(sample.ExpressionSet)
gs <- GeneSet(sample.ExpressionSet[100:109],
setName="sample.GeneSet", setIdentifier="123")
collectionType(gs)

## GOCollections are created by reference to GO terms and evidenceCodes
GOCollection("GO:0005488")
## requires library(GO); EntrezIdentifiers automatically created
GeneSet(GOCollection(c("GO:0005488", "GO:0019825"),
evidenceCode="IDA"))

CollectionType_constructors()

Collection Type Class Constructors

Description

These functions construct collection types. Collection types can be used in manipulating (e.g., selecting) sets, and can contain information specific to particular sets (e.g., 'category' and 'subcategory' classifications of 'BroadCollection'.)

Usage

NullCollection(...)
ComputedCollection(...)
ExpressionSetCollection(...)
ChrCollection(ids,...)
ChrlocCollection(ids,...)
KEGGCollection(ids,...)
MapCollection(ids,...)
OMIMCollection(ids,...)
PMIDCollection(ids,...)
PfamCollection(ids, ...)
PrositeCollection(ids, ...)
GOCollection(ids=character(0), evidenceCode="ANY", ontology="ANY", ..., err=FALSE)
OBOCollection(ids, evidenceCode="ANY", ontology="ANY", ...)
BroadCollection(category, subCategory=NA, ...)

Arguments

| Argument | Description |
|----------|-------------|
| category | (Required) Broad category, one of "c1" (positional), "c2" (curated), "c3" (motif), "c4" (computational), "c5" (GO), "c6" (Oncogenic Pathway Activation Modules), "c7" (Immunologic Signatures), "h" (Hallmark). |
| subCategory | (Optional) Sub-category; no controlled vocabulary. |
| ids | (Optional) Character vector of identifiers (e.g., GO, KEGG, or PMID terms). |
| evidenceCode | (Optional) Character vector of GO evidence codes to be included, or "ANY" (any identifier; the default). Evidence is a property of particular genes, rather than of the ontology, so evidenceCode is a convenient way of specifying how users of a GOCollection might restrict derived objects (as is done during creation of a gene set from an expression set). |
| ontology | (Optional) Character vector of GO ontology terms to be included, or "ANY" (any identifier; the default). Unlike evidence code, ontology membership is enforced when GOCollection gene sets are constructed. |
| err | (Optional) Logical scalar indicating whether non-existent GO terms signal an error (TRUE), or are silently ignored (FALSE). |
| ... | Additional arguments, usually none but see specific CollectionType classes for possibilities. |

Value

An object of the same class as the function name, initialized as appropriate for the collection.

See also

CollectionType

Author

Martin Morgan Martin.Morgan@RoswellPark.org

Examples

NullCollection()

## NullCollection when no collection type specified
collectionType(GeneSet())
collectionType(GeneSet(collectionType=GOCollection()))

## fl could be a url
fl <- system.file("extdata", "Broad.xml", package="GSEABase")
gs1 <- getBroadSets(fl)[[1]]
collectionType(gs1) # BroadCollection

## new BroadCollection, with different category
bc <- BroadCollection(category="c2")
## change collectionType of gs2
gs2 <- gs1
collectionType(gs2) <- NullCollection()

## OBOCollection
fl <- system.file("extdata", "goslim_plant.obo", package="GSEABase")
getOBOCollection(fl, evidenceCode="TAS") # returns OBOCollection
OBOCollection(c("GO:0008967", "GO:0015119", "GO:0030372", "GO:0002732",
"GO:0048090"))

GSEABase_package()

Gene set enrichment data structures and methods

Description

This package provides classes and methods to support Gene Set Enrichment Analysis (GSEA). The GeneSet class provides a common data structure for representing gene sets. The GeneColorSet class allows genes in a set to be associated with phenotypes. The GeneSetCollection class facilitates grouping together a list of related gene sets. The GeneIdentifierType class hierarchy reflects how genes are represented (e.g., Entrez versus symbol) in the gene set. mapIdentifiers provides a way to convert identifiers in a set from one type to another. The CollectionType class hierarchy reflects how the gene set was made, and can order genes into distinct sets or collections.

See also

GeneSet, GeneColorSet, GeneSetCollection

Author

Written by Martin Morgan, Seth Falcon, Robert Gentleman. Maintainer: Biocore Team c/o BioC user list bioconductor@stat.math.ethz.ch

Examples

example(GeneSet)

GeneColorSet_class()

Class "GeneColorSet"

Description

A GeneColorSet extends GeneSet to allow genes to be 'colored'. Coloring means that for a particular phenotype, each gene has a color (e.g., expression levels "up", "down", or "unchanged") and a phenotypic consequence (e.g., the phenotype is "enhanced" or "reduced").

All operations on a GeneSet can be applied to a GeneColorSet; coloring can also be accessed.

See also

GeneSet.

Author

Martin Morgan Martin.Morgan@RoswellPark.org

Examples

## Create a GeneColorSet from an ExpressionSet
data(sample.ExpressionSet)
gcs1 <- GeneColorSet(sample.ExpressionSet[100:109],
phenotype="imaginary")
gcs1
## or with color...
gcs2 <- GeneColorSet(sample.ExpressionSet[100:109],
phenotype="imaginary",
geneColor=factor(
rep(c("up", "down", "unchanged"),
length.out=10)),
phenotypeColor=factor(
rep(c("enhanced", "reduced"),
length.out=10)))
coloring(gcs2)

## recode geneColor of genes 1 and 4
coloring(gcs2)[c(1,4),"geneColor"] <- "down"
coloring(gcs2)
## reset, this time by gene name
coloring(gcs2)[c("31339_at", "31342_at"),"geneColor"] <- c("up", "up")
## usual 'factor' errors and warnings apply:
coloring(gcs2)[c("31339_at", "31342_at"),"geneColor"] <- c("UP", "up")

gcs2[["31342_at"]]
try(gcs2[["31342_"]]) # no partial matching
gcs2$"31342" # 1 partial match ok

GeneColorSet_methods()

Methods to Construct "GeneColorSet" Instances

Description

GeneColorSet is a generic for constructing gene color sets (i.e., gene sets with "coloring" to indicate how features of genes and phenotypes are associated).

See also

GeneColorSet-class

GeneIdentifierType_class()

Class "GeneIdentifierType"

Description

This class provides a way to tag the meaning of gene symbols in a GeneSet. For instance, a GeneSet with gene names derived from a Bioconductor annotation package (e.g., via ExpressionSet) initially has a GeneIdentifierType of AnnotationIdentifier.

See also

The example below lists GeneIdentifierType classes defined in this package; see the help pages of these classes for specific information.

Author

Martin Morgan Martin.Morgan@RoswellPark.org

Examples

names(getClass("GeneIdentifierType")@subclasses)

# create an AnnotationIdentifier, and ask for its type
geneIdType(AnnotationIdentifier(annotation="hgu95av2"))

# Construct a GeneSet from an ExpressionSet, using the 'annotation'
# field of ExpressionSet to recognize the genes as AnnotationIdentifier
data(sample.ExpressionSet)
gs <- GeneSet(sample.ExpressionSet[100:109],
setName="sample.GeneSet", setIdentifier="123")
geneIdType(gs) # AnnotationIdentifier

## Read a Broad set from the system (or a url), and discover their
## GeneIdentifierType
fl <- system.file("extdata", "Broad.xml", package="GSEABase")
bsets <- getBroadSets(fl)
sapply(bsets, geneIdType)

## try to combine gene sets with different set types
try(gs & bsets[[1]])

## Use the annotation package associated with the original
## ExpressionSet to map to EntrezIdentifier() ...
geneIdType(gs) <- EntrezIdentifier()
## ...and try again
gs & bsets[[1]]

## Another way to change annotation to Entrez (or other) ids
probeIds <- featureNames(sample.ExpressionSet)[100:109]
geneIds <- getEG(probeIds, "hgu95av2")
GeneSet(EntrezIdentifier(),
setName="sample.GeneSet2", setIdentifier="101",
geneIds=geneIds)

## Create a new identifier
setClass("FooIdentifier",
contains="GeneIdentifierType",
prototype=prototype(
type=new("ScalarCharacter", "Foo")))
## Create a constructor (optional)
FooIdentifier <- function() new("FooIdentifier")
geneIdType(FooIdentifier())

## tidy up
removeClass("FooIdentifier")

GeneIdentifierType_constructors()

Gene Identifier Class Constructors

Description

Gene identifier classes and functions are used to indicate what the list of genes in a gene set represents (e.g., Entrez gene identifiers are tagged with EntrezIdentifier(), Bioconductor annotations with AnnotationIdentifier()).

Usage

NullIdentifier(annotation, ...)
EnzymeIdentifier(annotation, ...)
ENSEMBLIdentifier(annotation, ...)
GenenameIdentifier(annotation,...)
RefseqIdentifier(annotation,...)
SymbolIdentifier(annotation,...)
UnigeneIdentifier(annotation,...)
UniprotIdentifier(annotation,...)
EntrezIdentifier(annotation,...)
AnnotationIdentifier(annotation, ...)
AnnoOrEntrezIdentifier(annotation, ...)

Arguments

| Argument | Description |
|----------|-------------|
| annotation | An optional character string identifying the Bioconductor package from which the annotations are drawn, e.g., hgu95av2, org.Hs.eg.db. Or an src_organism object, e.g., Organism.dplyr::src_organism(TxDb.Hsapiens.UCSC.hg38.knownGene). |
| ... | Additional arguments, usually none. |

Value

For all but AnnoOrEntrezIdentifier, an object of the same class as the function name, initialized as appropriate for the identifier.

For AnnoOrEntrezIdentifier, either an AnnotationIdentifier or EntrezIdentifier depending on the argument. This requires that the corresponding chip- or organism package be loaded, hence installed on the user's system.

See also

GeneIdentifierType-class for a description of the classes and methods using these objects.

Author

Martin Morgan Martin.Morgan@RoswellPark.org

Examples

NullIdentifier()

data(sample.ExpressionSet)
gs1 <- GeneSet(sample.ExpressionSet[100:109],
setName="sample1", setIdentifier="100")
geneIdType(gs1) # AnnotationIdentifier

geneIds <- featureNames(sample.ExpressionSet)[100:109]
gs2 <- GeneSet(geneIds=geneIds,
setName="sample1", setIdentifier="101")
geneIdType(gs2) # NullIdentifier, since no info about genes provided

## Convert...
ai <- AnnotationIdentifier(annotation(sample.ExpressionSet))
geneIdType(gs2) <- ai
geneIdType(gs2)
## ...or provide more explicit construction
gs3 <- GeneSet(geneIds=geneIds, type=ai,
setName="sample1", setIdentifier="102")

uprotIds <- c("Q9Y6Q1", "A6NJZ7", "Q9BXI6", "Q15035", "A1X283",
"P55957")
gs4 <- GeneSet(uprotIds, geneIdType=UniprotIdentifier())
geneIdType(gs4) # UniprotIdentifier
geneIds(mapIdentifiers(gs4, UnigeneIdentifier(annotation="org.Hs.eg")))

GeneSetCollection_class()

Class "GeneSetCollection"

Description

A GeneSetCollection is a collection of related GeneSets. The collection can mix and match different types of gene sets. Members of the collection are referred to by the setNames of each gene set.

See also

GeneSet, GeneColorSet.

Author

Martin Morgan Martin.Morgan@RoswellPark.org

Examples

gs1 <- GeneSet(setName="set1", setIdentifier="101")
gs2 <- GeneSet(setName="set2", setIdentifier="102")

## construct from individual elements...
gsc <- GeneSetCollection(gs1, gs2)
## or from a list
gsc <- GeneSetCollection(list(gs1, gs2))

## 'names' are the setNames
names(gsc)

## a collection of a single gene set
gsc["set1"]
## a gene set
gsc[["set1"]]

## set names must be unique
try(GeneSetCollection(gs1, gs1))
try(gsc[c("set1", "set1")])

GeneSetCollection_methods()

Methods to construct GeneSetCollection instances

Description

Use GeneSetCollection to construct a collection of gene sets from GeneSet arguments, or a list of GeneSets.

Usage

GeneSetCollection(object, ..., idType, setType)

Arguments

| Argument | Description |
|----------|-------------|
| object | An argument determining how the gene set collection will be created, as described in the methods section. |
| ... | Additional arguments for gene set collection construction, as described below. |
| idType | An argument of class GeneIdentifierType, used to indicate how the geneIds will be represented. |
| setType | An argument of class CollectionType, used to indicate how the collection is created. |

See also

GeneSetCollection-class

Examples

gs1 <- GeneSet(setName="set1", setIdentifier="101")
gs2 <- GeneSet(setName="set2", setIdentifier="102")

## construct from individual elements...
gsc <- GeneSetCollection(gs1, gs2)
## or from a list
gsc <- GeneSetCollection(list(gs1, gs2))

## set names must be unique
try(GeneSetCollection(gs1, gs1))

data(sample.ExpressionSet)
gsc <- GeneSetCollection(sample.ExpressionSet[200:250],
setType = GOCollection())

## from KEGG identifiers, for example
library(KEGG.db)
lst <- head(as.list(KEGGEXTID2PATHID))
gsc <- GeneSetCollection(mapply(function(geneIds, keggId) {
GeneSet(geneIds, geneIdType=EntrezIdentifier(),
collectionType=KEGGCollection(keggId),
setName=keggId)
}, lst, names(lst)))

GeneSet_class()

Class "GeneSet"

Description

A GeneSet contains a set of gene identifiers. Each gene set has a geneIdType, indicating how the gene identifiers should be interpreted (e.g., as Entrez identifiers), and a collectionType, indicating the origin of the gene set (perhaps including additional information about the set, as in the BroadCollection type).

Conversion between identifiers, subsetting, and logical (set) operations can be performed. Relationships between genes and phenotype in a GeneSet can be summarized using coloring to create a GeneColorSet. A GeneSet can be exported to XML with toBroadXML.

See also

GeneColorSet, CollectionType, GeneIdentifierType

Author

Martin Morgan Martin.Morgan@RoswellPark.org

Examples

## Empty gene set
GeneSet()
## Gene set from ExpressionSet
data(sample.ExpressionSet)
gs1 <- GeneSet(sample.ExpressionSet[100:109])
## GeneSet from Broad XML; 'fl' could be a url
fl <- system.file("extdata", "Broad.xml", package="GSEABase")
gs2 <- getBroadSets(fl)[[1]] # actually, a list of two gene sets
## GeneSet from list of geneIds
geneIds <- geneIds(gs2) # any character vector would do
gs3 <- GeneSet(geneIds=geneIds)
## unspecified set type, so...
is(geneIdType(gs3), "NullIdentifier") == TRUE
## update set type to match encoding of identifiers
geneIdType(gs2)
geneIdType(gs3) <- SymbolIdentifier()

## Convert between set types; this consults the 'annotation'
## information encoded in the 'AnnotationIdentifier' set type and the
## corresponding annotation package.
gs4 <- gs1
geneIdType(gs4) <- EntrezIdentifier()

## logical (set) operations
gs5 <- GeneSet(sample.ExpressionSet[100:109], setName="subset1")
gs6 <- GeneSet(sample.ExpressionSet[105:114], setName="subset2")
## intersection: 5 'genes'; note the set name '(subset1 & subset2)'
gs5 & gs6
## union: 15 'genes'; note the set name
gs5 | gs6
## an identity
gs7 <- gs5 | gs6
gs8 <- setdiff(gs5, gs6) | (gs5 & gs6) | setdiff(gs6, gs5)
identical(geneIds(gs7), geneIds(gs8))
identical(gs7, gs8) == FALSE # gs7 and gs8 setNames differ

## output
tmp <- tempfile()
toBroadXML(gs2, tmp)
noquote(readLines(tmp))
## must be BroadCollection() collectionType
try(toBroadXML(gs1))
gs9 <- gs1
collectionType(gs9) <- BroadCollection()
toBroadXML(gs9, tmp)
unlink(tmp)
toBroadXML(gs9) # no connection --> character vector
## list of geneIds --> vector of Broad GENESET XML
gs10 <- getBroadSets(fl) # two sets
entries <- sapply(gs10, function(x) toBroadXML(x))

## list mapIdentifiers available for GeneSet
showMethods("mapIdentifiers", classes="GeneSet", inherit=FALSE)

GeneSet_methods()

Methods to construct GeneSet instances

Description

Use GeneSet to construct gene sets from ExpressionSet, character vector, or other objects.

Usage

GeneSet(type, ..., setIdentifier=.uniqueIdentifier())

Arguments

| Argument | Description |
|----------|-------------|
| type | An argument determining how the gene set will be created, as described in the Methods section. |
| setIdentifier | A ScalarCharacter or length-1 character vector uniquely identifying the set. |
| ... | Additional arguments for gene set construction. Methods have required arguments, as outlined below; additional arguments correspond to slot names of GeneSet. |

See also

GeneSet-class, GeneColorSet-class

Examples

## Empty gene set
GeneSet()

## Gene set from ExpressionSet
data(sample.ExpressionSet)
gs1 <- GeneSet(sample.ExpressionSet[100:109])

## GeneSet from Broad XML; 'fl' could be a url
fl <- system.file("extdata", "Broad.xml", package="GSEABase")
gs2 <- getBroadSets(fl)[[1]] # actually, a list of two gene sets

## GeneSet from list of gene identifiers
geneIds <- geneIds(gs2) # any character vector would do
gs3 <- GeneSet(geneIds)
## unspecified set type, so...
is(geneIdType(gs3), "NullIdentifier") == TRUE
## update set type to match encoding of identifiers
geneIdType(gs2)
geneIdType(gs3) <- SymbolIdentifier()
## other ways of accomplishing the same
gs4 <- GeneSet(geneIds, geneIdType=SymbolIdentifier())
gs5 <- GeneSet(SymbolIdentifier(), geneIds=geneIds)

OBOCollection_class()

Class "OBOCollection"

Description

OBOCollection extends the GOCollection class, and is usually constructed from a file formatted following the OBO file format. See CollectionType for general use of collections with gene sets.

See also

OBOCollection constructor; CollectionType classes.

Author

Martin Morgan Martin.Morgan@RoswellPark.org

References

http://www.geneontology.org for details of the OBO format.

Examples

fl <- system.file("extdata", "goslim_plant.obo", package="GSEABase")
obo <- getOBOCollection(fl)
obo
subsets(obo)
obo["goslim_plant", evidenceCode="TAS"]
g <- as(obo["goslim_goa"], "graphNEL")
if (interactive() && require("Rgraphviz")) {
plot(g)
}

details_methods()

Methods for Displaying Detailed GeneSet Information

Description

This generic and methods supplement show, providing more detail on object contents.
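
As a brief illustration, the sketch below contrasts the default display with details(). It assumes GSEABase and Biobase are installed; the set name "demo" is made up for the example.

```r
library(GSEABase)
library(Biobase)
data(sample.ExpressionSet)

## 'demo' is an arbitrary set name chosen for this sketch
gs <- GeneSet(sample.ExpressionSet[100:109], setName="demo")

gs            # default 'show' output: a compact summary
details(gs)   # the same object, displayed with more of its contents
```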

getOBOCollection()

Read OBO-specified Gene Ontology Collections

Description

getOBOCollection parses a uri (file or internet location) encoded following the OBO specification defined by the Gene Ontology consortium.

Usage

getOBOCollection(uri, evidenceCode="ANY", ...)

Arguments

| Argument | Description |
|----------|-------------|
| uri | A file name or URL containing gene sets encoded following the OBO specification. |
| evidenceCode | A character vector of evidence codes. |
| ... | Further arguments passed to the OBOCollection constructor. |

Value

getOBOCollection returns an OBOCollection of gene sets. The gene set is constructed by parsing the file for id tags in TERM stanzas. The parser does not currently support all features of OBO, e.g., the ability to import additional files.

See also

OBOCollection constructor, OBOCollection-class

Author

Martin Morgan mtmorgan@fhcrc.org

References

http://www.geneontology.org

Examples

## 'fl' could also be a URI
fl <- system.file("extdata", "goslim_plant.obo", package="GSEABase")
getOBOCollection(fl) # returns an OBOCollection

## Download from the internet
fl <- "http://www.geneontology.org/GO_slims/goslim_plant.obo"
getOBOCollection(fl, evidenceCode="TAS")

Read and write gene sets from Broad or GMT formats

Description

getBroadSets parses one or more XML files for gene sets. The file can reside locally or at a URL. The format followed is that defined by the Broad (below). toBroadXML creates Broad XML from BroadCollection gene sets.

toGmt converts GeneSetCollection objects to a character vector representing the gene set collection in GMT format. getGmt reads a GMT file or other character vector into a GeneSetCollection.

Usage

getBroadSets(uri, ..., membersId=c("MEMBERS_SYMBOLIZED", "MEMBERS_EZID"))
toBroadXML(geneSet, con, ...)
asBroadUri(name,
           base="http://www.broad.mit.edu/gsea/msigdb/cards")
getGmt(con, geneIdType=NullIdentifier(),
       collectionType=NullCollection(), sep="\t", ...)
toGmt(x, con, ...)

Arguments

| Argument | Description |
|----------|-------------|
| uri | A file name or URL containing gene sets encoded following the Broad specification. For Broad sets, the uri can point to a MSIGDB. |
| geneSet | A GeneSet with collectionType BroadCollection (to ensure that required information is available). |
| x | A GeneSetCollection or other object for which a toGmt method is defined. |
| con | A (optional, in the case of toXxx) file name or connection to receive output. |
| name | A character vector of Broad gene set names, e.g., c('chr16q', 'GNF2_TNFSF10'). |
| base | Base uri for finding Broad gene sets. |
| geneIdType | A constructor for the type of identifier the members of the gene sets represent. See GeneIdentifierType for more information. |
| collectionType | A constructor for the type of collection for the gene sets. See CollectionType for more information. |
| sep | The character string separating members of each gene set in the GMT file. |
| ... | Further arguments passed to the underlying XML parser, particularly file, used to specify an output connection for toBroadXML. |
| membersId | XML field name from which geneIds are derived. Choose one value; default MEMBERS_SYMBOLIZED. |
Value

getBroadSets returns a GeneSetCollection of gene sets.

toBroadXML returns a character vector of a single GeneSet or, if con is provided, writes the XML to a file.

asBroadUri can be used to create URI names (to be used by getBroadSets) of Broad files.

getGmt returns a GeneSetCollection of gene sets.

toGmt returns character vectors where each line represents a gene set. If con is provided, the result is written to the specified connection.

See also

GeneSetCollection, GeneSet

Note

Actual Broad XML files differ from the DTD (e.g., an implied ',' separator between genes in a set); we parse to and from files as they actually exist.

Author

Martin Morgan mtmorgan@fhcrc.org

References

http://www.broad.mit.edu/gsea/

Examples

## 'fl' could also be a URI
fl <- system.file("extdata", "Broad.xml", package="GSEABase")
gss <- getBroadSets(fl) # GeneSetCollection of 2 sets
names(gss)
gss[[1]]

## Download 'msigdb_v2.5.xml' or 'c3.all.v2.5.symbols.gmt' from the
## Broad, http://www.broad.mit.edu/gsea/downloads.jsp#msigdb, then
gsc <- getBroadSets("/path/to/msigdb_v2.5.xml")
types <- sapply(gsc, function(elt) bcCategory(collectionType(elt)))
c3gsc1 <- gsc[types == "c3"]
c3gsc2 <- getGmt("/path/to/c3.all.v2.5.symbols.gmt",
collectionType=BroadCollection(category="c3"),
geneIdType=SymbolIdentifier())

fl <- tempfile()
toBroadXML(gss[[1]], con=fl)
noquote(readLines(fl))
unlink(fl)

toBroadXML(gss[[1]]) # character vector

fl <- tempfile()
toGmt(gss, fl)
getGmt(fl)
unlink(fl)
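
For orientation, GMT is a tab-delimited text format in which each line holds a set name, a description, and then the member identifiers. The base-R sketch below writes and re-reads such lines by hand; it does not use getGmt or toGmt, and the set names, gene symbols, and the "na" description placeholder are invented for illustration.

```r
## Invented gene sets; "na" stands in for an empty description field
sets <- list(setA = c("TP53", "BRCA1"),
             setB = c("EGFR", "KRAS", "MYC"))

## One GMT line per set: name, description, then tab-separated members
gmtLines <- vapply(names(sets),
                   function(nm) paste(c(nm, "na", sets[[nm]]), collapse = "\t"),
                   character(1), USE.NAMES = FALSE)

## Parse the lines back: field 1 is the name, field 2 the description,
## and the remaining fields are the members
fields <- strsplit(gmtLines, "\t", fixed = TRUE)
parsed <- setNames(lapply(fields, function(f) f[-(1:2)]),
                   vapply(fields, function(f) f[1], character(1)))

identical(parsed, sets)   # round trip recovers the original sets
```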

goSlim_methods()

Methods for Function goSlim in Package GSEABase

Description

These methods summarize the gene ontology terms implied by the idSrc argument into the GO terms implied by the slimCollection argument. The summary takes identifiers in idSrc and determines all GO terms that apply to the identifiers. This full list of GO terms is then classified for membership in each term in the slimCollection. The resulting object is a data frame containing the terms of slimCollection as row labels, counts and frequencies of identifiers classified to each term, and an abbreviated term description.

An identifier in idSrc can expand to several GO terms, and the GO terms in slimCollection can imply an overlapping hierarchy of terms. Thus the resulting summary can easily contain more counts than there are identifiers in idSrc.

Usage

goSlim(idSrc, slimCollection, ontology, ..., verbose=FALSE)

Arguments

| Argument | Description |
|----------|-------------|
| idSrc | An argument determining the source of GO terms to be mapped to slim terms. The source might be a GOCollection of terms, or another object (e.g., ExpressionSet) for which the method can extract GO terms. |
| slimCollection | An argument containing the GO slim terms. |
| ontology | A character string naming the ontology to be consulted when identifying slim term hierarchies. One of MF (molecular function), BP (biological process), CC (cellular component). |
| ... | Additional arguments passed to specific methods. |
| verbose | Logical influencing whether messages (primarily missing GO terms arising during creation of the slim hierarchy) are reported. |

Examples

myIds <- c("GO:0016564", "GO:0003677", "GO:0004345", "GO:0008265",
           "GO:0003841", "GO:0030151", "GO:0006355", "GO:0009664",
           "GO:0006412", "GO:0015979", "GO:0006457", "GO:0005618",
           "GO:0005622", "GO:0005840", "GO:0015935", "GO:0000311")
myCollection <- GOCollection(myIds)
fl <- system.file("extdata", "goslim_plant.obo", package="GSEABase")
slim <- getOBOCollection(fl)
goSlim(myCollection, slim, "MF")
data(sample.ExpressionSet)
goSlim(sample.ExpressionSet, slim, "MF", evidenceCode="TAS")
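
The remark that counts can exceed the number of identifiers can be made concrete with a toy calculation in base R. This is only a sketch of the counting idea, not the goSlim implementation: the identifiers, the slim term names, and the mapping between them are all invented.

```r
## Invented mapping: each identifier implies one or more slim terms
## (directly or through ancestors in the hierarchy)
implied <- list(id1 = c("slimA", "slimB"),
                id2 = c("slimB"),
                id3 = c("slimA", "slimB", "slimC"))
slimTerms <- c("slimA", "slimB", "slimC", "slimD")

## Count the identifiers classified to each slim term
counts <- vapply(slimTerms,
                 function(term) {
                     sum(vapply(implied, function(x) term %in% x, logical(1)))
                 },
                 integer(1))

slimSummary <- data.frame(Count = counts, row.names = slimTerms)
slimSummary

## Three identifiers yield six classifications in total
sum(counts) > length(implied)
```

Because id1 and id3 each fall under several slim terms, the column total is larger than the number of identifiers, which is the situation the description warns about.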

incidence_methods()

Methods for Constructing Incidence Matrices Between GeneSets

Description

An incidence matrix summarizes shared membership of gene identifiers across (pairs of) gene sets.

Examples

fl <- system.file("extdata", "Broad.xml", package="GSEABase")
gss <- getBroadSets(fl) # GeneSetCollection of 2 sets
## From one or more GeneSetCollections...
imat <- incidence(gss)
dim(imat)
imat[,c(1:3,ncol(imat)-3+1:3)]

## .. or GeneSets
imat1 <- incidence(gss[[1]], gss[[2]], gss[[1]])
imat1[,1:5]
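
The shape of such a matrix can be sketched in base R without GSEABase. The toy sets and gene names below are invented, and the layout (one row per set, one column per identifier, 1 marking membership) is an assumption chosen to match the column indexing used in the examples above; it is not the package's implementation.

```r
## Toy gene sets; names and members are made up for illustration
sets <- list(set1 = c("A", "B", "C"),
             set2 = c("B", "C", "D"))

## All identifiers seen in any set, in order of first appearance
genes <- unique(unlist(sets, use.names = FALSE))

## One row per set, one column per gene; 1 = gene belongs to the set
imat <- t(vapply(sets,
                 function(ids) as.integer(genes %in% ids),
                 integer(length(genes))))
colnames(imat) <- genes
imat

## Number of identifiers shared by each pair of sets
shared <- tcrossprod(imat)
shared
```

The off-diagonal entries of shared give the overlap between pairs of sets, and the diagonal gives each set's size.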

mapIdentifiers_methods()

Methods for Function mapIdentifiers in Package GSEABase

Description

These methods convert the gene identifiers of a gene set from one type to another, e.g., from EntrezIdentifier to AnnotationIdentifier. Methods can be called directly by the user; geneIdType<- provides similar functionality.

verbose=TRUE produces warning messages when maps between identifier types are not 1:1, or a map has to be constructed on the fly (this situation does not apply when using the DBI-based annotation packages).
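
A minimal sketch of both routes follows. It assumes that GSEABase, Biobase, and the annotation package matching the sample data (hgu95av2.db) are installed; the set name "demo" is arbitrary.

```r
library(GSEABase)
library(Biobase)
data(sample.ExpressionSet)

## Probe identifiers tagged with the ExpressionSet's annotation
gs <- GeneSet(sample.ExpressionSet[100:109], setName="demo")
geneIdType(gs)                 # AnnotationIdentifier

## Direct call; verbose=TRUE reports maps that are not 1:1
gsEntrez <- mapIdentifiers(gs, EntrezIdentifier(), verbose=TRUE)
geneIds(gsEntrez)

## Similar functionality through the replacement method
gs2 <- gs
geneIdType(gs2) <- EntrezIdentifier()
geneIdType(gs2)
```

Because probe-to-gene maps need not be 1:1, the converted set may contain a different number of identifiers than the original.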