biopython v1.71.0 Bio.UniProt.GOA

Parsers for the GAF, GPA and GPI formats from UniProt-GOA.

UniProt-GOA README + GAF format description: ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/UNIPROT/README

GAF formats: http://www.geneontology.org/GO.format.annotation.shtml

gp_association (GPA format) README: ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/UNIPROT/gp_association_readme

gp_information (GPI format) README: ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/UNIPROT/gp_information_readme

Summary

Functions

_gpa10iterator(): Read GPA 1.0 format files (PRIVATE)

_gpa11iterator(): Read GPA 1.1 format files (PRIVATE)

_gpi10iterator(): Read GPI 1.0 format files (PRIVATE)

_gpi11iterator(): Read GPI 1.1 format files (PRIVATE)

gafbyproteiniterator(): Iterates over records in a gene association file

gafiterator(): Iterate over a GAF 1.0 or 2.0 file

gpa_iterator(): Wrapper function: read GPA format files

gpi_iterator(): Read GPI format files

record_has(): Accepts a record, and a dictionary of field values

writebyproteinrec(): Write a list of GAF records to an output stream

writerec(): Write a single UniProt-GOA record to an output stream

Functions

_gpa10iterator()

Read GPA 1.0 format files (PRIVATE).

This iterator is used to read a gp_association.* file which is in the GPA 1.0 format. Do not call it directly; use the gpa_iterator function instead.

_gpa11iterator()

Read GPA 1.1 format files (PRIVATE).

This iterator is used to read a gp_association.goa_uniprot file which is in the GPA 1.1 format. Do not call it directly; use the gpa_iterator function instead.

_gpi10iterator()

Read GPI 1.0 format files (PRIVATE).

This iterator is used to read a gp_information.goa_uniprot file which is in the GPI 1.0 format.

_gpi11iterator()

Read GPI 1.1 format files (PRIVATE).

This iterator is used to read a gp_information.goa_uniprot file which is in the GPI 1.1 format.

gafbyproteiniterator()

Iterates over records in a gene association file.

This function should be called to read a gene_association.goa_uniprot file. It reads the first record and returns a GAF 2.0 or a GAF 1.0 iterator as needed. Each iteration returns a list of all consecutive records with the same DB_Object_ID.

Change note (2016-04-09): added a GAF 2.1 iterator and fixed a bug in iterator assignment. For now, GAF 2.1 files are read with the GAF 2.0 iterator.
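The grouping of consecutive records by DB_Object_ID can be illustrated without Biopython. The following is a minimal sketch, not the library's implementation: it assumes records are plain dictionaries with a 'DB_Object_ID' key, the helper name group_by_protein is hypothetical, and the accessions and GO terms are made-up sample values.

```python
from itertools import groupby


def group_by_protein(records):
    """Yield lists of consecutive records sharing a DB_Object_ID (hypothetical helper)."""
    # groupby only merges adjacent items, which matches the "consecutive" rule.
    for _, group in groupby(records, key=lambda rec: rec["DB_Object_ID"]):
        yield list(group)


# Made-up sample records, already ordered by protein as in a GOA file.
records = [
    {"DB_Object_ID": "P12345", "GO_ID": "GO:0005575"},
    {"DB_Object_ID": "P12345", "GO_ID": "GO:0008150"},
    {"DB_Object_ID": "Q67890", "GO_ID": "GO:0003674"},
]

for protein_records in group_by_protein(records):
    print(protein_records[0]["DB_Object_ID"], len(protein_records))
```

Because only adjacent records are merged, an ID that reappears later in the input starts a new list rather than being appended to the earlier one.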

gafiterator()

Iterate over a GAF 1.0 or 2.0 file.

This function should be called to read a gene_association.goa_uniprot file. It reads the first record and returns a GAF 2.0 or a GAF 1.0 iterator as needed.

Example: open, read, iterate and filter results.

The original data file has been trimmed to ~600 rows.

Original source: ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/YEAST/goa_yeast.gaf.gz

 >>> from Bio.UniProt.GOA import gafiterator, record_has
 >>> Evidence = {'Evidence': set(['ND'])}
 >>> Synonym = {'Synonym': set(['YA19A_YEAST', 'YAL019W-A'])}
 >>> Taxon_ID = {'Taxon_ID': set(['taxon:559292'])}
 >>> with open('UniProt/goa_yeast.gaf', 'r') as handle:
 ...     for rec in gafiterator(handle):
 ...         if record_has(rec, Taxon_ID) and record_has(rec, Evidence) and record_has(rec, Synonym):
 ...             for key in ('DB_Object_Name', 'Evidence', 'Synonym', 'Taxon_ID'):
 ...                 print(rec[key])
 ...
 Putative uncharacterized protein YAL019W-A
 ND
 ['YA19A_YEAST', 'YAL019W-A']
 ['taxon:559292']
 Putative uncharacterized protein YAL019W-A
 ND
 ['YA19A_YEAST', 'YAL019W-A']
 ['taxon:559292']
 Putative uncharacterized protein YAL019W-A
 ND
 ['YA19A_YEAST', 'YAL019W-A']
 ['taxon:559292']
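
The version check that precedes iteration can be pictured as reading the first line of the file and inspecting its header. The sketch below is an illustration only, not Biopython's code: it assumes the file begins with a '!gaf-version:' header line, and the helper name sniff_gaf_version is hypothetical.

```python
import io


def sniff_gaf_version(handle):
    """Return the version from a '!gaf-version:' first line, or None (hypothetical helper)."""
    first_line = handle.readline()
    prefix = "!gaf-version:"
    if first_line.startswith(prefix):
        return first_line[len(prefix):].strip()
    return None


# An in-memory stand-in for an opened GAF file.
sample = io.StringIO("!gaf-version: 2.0\n")
print(sniff_gaf_version(sample))
```

A caller could use the returned string to choose the column layout that matches the file before parsing the remaining lines.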

gpa_iterator()

Wrapper function: read GPA format files.

This function should be called to read a gp_association.goa_uniprot file. It reads the first record and returns a GPA 1.1 or a GPA 1.0 iterator as needed.

gpi_iterator()

Read GPI format files.

This function should be called to read a gp_information.goa_uniprot file. At the moment, there is only one format, but this may change, so this function is a placeholder for a future wrapper.

record_has()

Accepts a record, and a dictionary of field values.

The format is {'field_name': set([val1, val2])}. If any of the listed fields in the record has a matching value, the function returns True. Otherwise, it returns False.
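
The matching rule can be sketched in a few lines of plain Python. This is an illustration under stated assumptions rather than the library's implementation: records are assumed to be dictionaries whose values are either a string or a list of strings, and the helper name matches_any is hypothetical.

```python
def matches_any(record, field_values):
    """Return True if any listed field shares a value with its set (hypothetical helper)."""
    for field, wanted in field_values.items():
        value = record[field]
        # Single-valued fields are strings; multi-valued fields are lists.
        present = {value} if isinstance(value, str) else set(value)
        if present & wanted:
            return True
    return False


record = {"Evidence": "ND", "Synonym": ["YA19A_YEAST", "YAL019W-A"]}
print(matches_any(record, {"Evidence": {"ND", "IEA"}}))
print(matches_any(record, {"Synonym": {"YAL001C"}}))
```

Since one matching field is enough, requiring several fields to match at once means calling the check once per field and combining the results with `and`, as the gafiterator example does with record_has.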

writebyproteinrec()

Write a list of GAF records to an output stream.

The caller should know the format version. Default: gaf-2.0. If header has a value, this is assumed to be the first record, and a header is written. Typically the list is the one read by gafbyproteiniterator, which contains all consecutive lines with the same DB_Object_ID.

writerec()

Write a single UniProt-GOA record to an output stream.

The caller should know the format version. Default: gaf-2.0. If header has a value, this is assumed to be the first record, and a header is written.
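
To make the output format concrete, the sketch below writes a header line followed by one tab-separated record to an in-memory stream. It is an assumption-laden illustration, not the library's writer: the helper name write_tab_record is hypothetical, the three-column field list is a shortened stand-in for the full GAF column list, and the record values are made up.

```python
import io


def write_tab_record(record, handle, fields):
    """Write one record as a tab-separated line (hypothetical helper)."""
    columns = []
    for field in fields:
        value = record[field]
        # Multi-valued fields are joined with '|', as in GAF files.
        columns.append("|".join(value) if isinstance(value, list) else value)
    handle.write("\t".join(columns) + "\n")


# Shortened, illustrative field list; real GAF files have more columns.
fields = ["DB", "DB_Object_ID", "Synonym"]
record = {"DB": "UniProtKB", "DB_Object_ID": "P12345", "Synonym": ["NAME1", "NAME2"]}

out = io.StringIO()
out.write("!gaf-version: 2.0\n")  # header line, written once before the first record
write_tab_record(record, out, fields)
print(out.getvalue(), end="")
```

Passing the field list explicitly is what lets the caller decide the format version, which is why the caller is expected to know it.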