biopython v1.71.0 Bio.GenBank.Record.Record
Hold GenBank information in a format similar to the original record.
The Record class is meant to make data easy to get to when you are just interested in looking at GenBank data.
- locus - The name specified after the LOCUS keyword in the GenBank record. This may be the accession number, or a clone id or something else.
- size - The size of the record.
- residue_type - The type of residues making up the sequence in this record. Normally something like RNA, DNA or PROTEIN, but may be as esoteric as ‘ss-RNA circular’.
- data_file_division - The division this record is stored under in GenBank (e.g. PLN -> plants; PRI -> humans, primates; BCT -> bacteria…)
- date - The date of submission of the record, in a form like ‘28-JUL-1998’
- accession - list of all accession numbers for the sequence.
- nid - Nucleotide identifier number.
- pid - Protein identifier number.
- version - The accession number + version (e.g. AB01234.2)
- db_source - Information about the database the record came from
- gi - The NCBI gi identifier for the record.
- keywords - A list of keywords related to the record.
- segment - If the record is one of a series, this is info about which segment this record is (something like ‘1 of 6’).
- source - The source of material where the sequence came from.
- organism - The genus and species of the organism (e.g. ‘Homo sapiens’)
- taxonomy - A listing of the taxonomic classification of the organism, starting general and getting more specific.
- references - A list of Reference objects.
- comment - Text with any kind of comment about the record.
- features - A listing of Features making up the feature table.
- base_counts - A string with the counts of bases for the sequence.
- origin - A string specifying info about the origin of the sequence.
- sequence - A string with the sequence itself.
- contig - A string of location information for a CONTIG in a RefSeq file
- project - The genome sequencing project numbers (will be replaced by the dblink cross-references in 2009).
- dblinks - The genome sequencing project number(s) and other links (will replace the project information in 2009).
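Several of the attributes above are described as strings in a fixed form (date, version, segment). The following is a minimal sketch of how a caller might convert such strings into typed values. It uses only the standard library and assumes the values look like the examples quoted in the list; the helper names are hypothetical and are not part of Biopython.

```python
from datetime import datetime


def parse_genbank_date(text):
    """Convert a date string such as '28-JUL-1998' into a datetime.date."""
    # Assumes the default (English) locale, where %b matches 'JUL'.
    return datetime.strptime(text, "%d-%b-%Y").date()


def split_version(text):
    """Split a version string such as 'AB01234.2' into ('AB01234', 2)."""
    accession, _, number = text.rpartition(".")
    return accession, int(number)


def parse_segment(text):
    """Turn a segment string such as '1 of 6' into the pair (1, 6)."""
    index, _, total = text.partition(" of ")
    return int(index), int(total)


# Example values taken from the attribute descriptions above.
print(parse_genbank_date("28-JUL-1998"))
print(split_version("AB01234.2"))
print(parse_segment("1 of 6"))
```

In practice the strings would come from a parsed record's date, version and segment attributes; records whose fields deviate from these forms would need extra error handling.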
Summary
Provide a GenBank formatted output option for a Record
Output for the ACCESSION line
Output for the BASE COUNT line with base information
Output for the COMMENT lines
Output for CONTIG location information from RefSeq
Output for DBSOURCE line
Provide output for the DEFINITION line
Output for the FEATURES line
Output for the KEYWORDS line
Provide the output string for the LOCUS line
Output for the NID line. Use of NID is obsolete in GenBank files
Output for ORGANISM line with taxonomy info
Output for the ORIGIN line
Output for PID line. Presumably, PID usage is also obsolete
Output for the SEGMENT line
Output for all of the sequence
Output for SOURCE line on where the sample came from
Output for the VERSION line
Functions
Provide a GenBank formatted output option for a Record.
The objective of this is to provide an easy way to read in a GenBank record, modify it somehow, and then output it in ‘GenBank format.’ We are striving to make this work so that a parsed Record that is output using this function will look exactly like the original record.
Much of the output is based on format description info at:
Output for the ACCESSION line.
Output for the BASE COUNT line with base information.
Output for the COMMENT lines.
Output for CONTIG location information from RefSeq.
Output for DBSOURCE line.
Provide output for the DEFINITION line.
Output for the FEATURES line.
Output for the KEYWORDS line.
Provide the output string for the LOCUS line.
Output for the NID line. Use of NID is obsolete in GenBank files.
Output for ORGANISM line with taxonomy info.
Output for the ORIGIN line.
Output for PID line. Presumably, PID usage is also obsolete.
Output for the SEGMENT line.
Output for all of the sequence.
Output for SOURCE line on where the sample came from.
Output for the VERSION line.