biopython v1.71.0 API Reference

(auto-generated from source code)

Modules

Reading information from Affymetrix CEL files version 3 and 4

Affymetrix parser error

Stores the information in a cel file

Extract information from alignment objects

Represent a position specific score matrix

Calculate summary info about the alignment

Bio.AlignIO support for “clustal” output from CLUSTAL W and other tools

Clustalw alignment iterator

Clustalw alignment writer

Bio.AlignIO support for “emboss” alignment output from EMBOSS tools

Emboss alignment iterator

Emboss alignment writer (WORK IN PROGRESS)

Bio.AlignIO support for “fasta-m10” output from Bill Pearson’s FASTA tools

AlignIO support module (not for general use)

Base class for building MultipleSeqAlignment iterators

Base class for building MultipleSeqAlignment writers

Base class for building MultipleSeqAlignment writers

Bio.AlignIO support for the “maf” multiple alignment format

Index for a MAF file

Accepts a MultipleSeqAlignment object, writes a MAF file

Bio.AlignIO support for “xmfa” output from Mauve/ProgressiveMauve

Mauve xmfa alignment iterator

Mauve/XMFA alignment writer

Bio.AlignIO support for the “nexus” file format

Nexus alignment writer

AlignIO support for “phylip” format from Joe Felsenstein’s PHYLIP tools

Reads a Phylip alignment file returning a MultipleSeqAlignment iterator

Relaxed Phylip format Iterator

Relaxed Phylip format writer

Sequential Phylip format Iterator

Sequential Phylip format Writer

Bio.AlignIO support for “stockholm” format (used in the PFAM database)

Loads a Stockholm file from PFAM into MultipleSeqAlignment objects

Stockholm/PFAM alignment writer

Standard nucleotide and protein alphabets defined by IUPAC

Extended IUPAC DNA alphabet

Extended uppercase IUPAC protein single letter alphabet including X etc

Uppercase IUPAC ambiguous DNA

Uppercase IUPAC ambiguous RNA

Uppercase IUPAC protein single letter alphabet of the 20 standard amino acids

Uppercase IUPAC unambiguous DNA (letters GATC only)

Uppercase IUPAC unambiguous RNA (letters GAUC only)

Reduced alphabets which lump together several amino-acids into one letter

Definitions for interacting with BLAST related applications

Wrapper for the NCBI BLAST+ program blast_formatter

Code for calling standalone BLAST and parsing plain text output (DEPRECATED)

Attempt to catch and diagnose BLAST errors while parsing

Parses BLAST data into a Record.Blast object

Iterates over a file of multiple BLAST results

Error caused by running a low quality sequence through BLAST

Parses BLAST data into a Record.PSIBlast object

Error caused by running a short query sequence through BLAST

Code to invoke the NCBI BLAST server over the internet

Code to work with the BLAST XML output

Parse XML BLAST data into a Record.Blast object

A parser for the NCBI blastpgp version 2.2.5 output format. Currently supports only the ‘-m 9’ option (table with annotations). Returns a BlastTableRec instance

Record classes to hold BLAST output

Stores information about one hit in the alignments section

Saves the results from a blast search

Holds information about a database report

Stores information about one hit in the descriptions section

Stores information about one hsp in an alignment hit

Saves information from a blast header

Holds information about a multiple alignment

Saves the results from a blastpgp search

Holds information about the parameters

Holds information from a PSI-BLAST round

Codon tables based on those from the NCBI

A codon-table, or genetic code

Initialize the class

Information about the IUPAC alphabets

Additional protein alphabets used in the SCOP database and PDB files

Bio.DocSQL: easy access to DB API databases (DEPRECATED)

Initialize the class

Initialize the class

Initialize the class

SHOW TABLES

Initialize the class

Initialize the class

Initialize the class

Initialize the class

Code to interact with and run various EMBOSS programs

Commandline object for the diffseq program from EMBOSS

Commandline object for the einverted program from EMBOSS

Commandline object for the etandem program from EMBOSS

Commandline object for the est2genome program from EMBOSS

Commandline object for the fconsense program from EMBOSS

Commandline object for the fdnadist program from EMBOSS

Commandline object for the fdnapars program from EMBOSS

Commandline object for the fneighbor program from EMBOSS

Commandline object for the fprotdist program from EMBOSS

Commandline object for the fprotpars program from EMBOSS

Commandline object for the fseqboot program from EMBOSS

Commandline object for the ftreedist program from EMBOSS

Commandline object for the fuzznuc program from EMBOSS

Commandline object for the needle program from EMBOSS

Commandline object for the needleall program from EMBOSS

Commandline object for the palindrome program from EMBOSS

Commandline object for the primersearch program from EMBOSS

Commandline object for the seqmatchall program from EMBOSS

Commandline object for the seqret program from EMBOSS

Commandline object for the stretcher program from EMBOSS

Commandline object for the tranalign program from EMBOSS

Commandline object for the water program from EMBOSS

Code to parse output from the EMBOSS eprimer3 program

A primer set designed by Primer3

Represent information from a primer3 run finding primers

Code to interact with the primersearch program from EMBOSS

Represent a single amplification from a primer

Represent the input file into the primersearch program

Represent the information from a primersearch job

Parser for XML results returned by NCBI’s Entrez Utilities

Initialize the class

Initialize the class

Initialize the class

XML tag found which was not defined in the DTD

Parse the enzyme.dat file from Enzyme at ExPASy

Holds information from an ExPASy ENZYME record as a Python dictionary

Code to work with the prosite.doc file from Prosite

Holds information from a Prodoc record

Holds information from a Prodoc citation

Parser for the prosite dat file from Prosite at ExPASy

Holds information from a Prosite record

Execute a ScanProsite search

Initialize the class

Represents search results returned by ScanProsite

Several routines used to extract information from FSSP sections

Initialize the class

A Python handle that adds functionality for saving lines

General functionality for crossover that doesn’t apply

Perform crossovers, but do not allow decreases in organism fitness

Generalized N-Point Crossover

Perform n-point crossover between genomes at some defined rates

Demonstration class for Interleaving crossover

Helper class for Two Point crossovers

Perform two-point crossovers between the genomes of two organisms

Perform point crossover between genomes at some defined rate

Perform two-point crossovers between the genomes of two organisms

Perform two point crossover between genomes at some defined rate

Perform uniform crossovers between the genomes of two organisms

Perform single point crossover between genomes at some defined rates
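The crossover entries above describe exchanging genome segments between two organisms. As a minimal sketch of the two-point variant on plain Python sequences (not the library's own implementation; the function name `two_point_crossover` and the explicit cut-point arguments are assumptions made for illustration):

```python
def two_point_crossover(parent_a, parent_b, left, right):
    """Swap the segment [left:right] between two equal-length genomes.

    Hypothetical helper for illustration; works on lists or strings.
    """
    if len(parent_a) != len(parent_b):
        raise ValueError("genomes must have the same length")
    if not 0 <= left <= right <= len(parent_a):
        raise ValueError("cut points must satisfy 0 <= left <= right <= len")
    # Each child keeps its own flanks and takes the middle from the other parent.
    child_a = parent_a[:left] + parent_b[left:right] + parent_a[right:]
    child_b = parent_b[:left] + parent_a[left:right] + parent_b[right:]
    return child_a, child_b
```

In a genetic algorithm the two cut points would normally be drawn at random and the crossover applied only with some probability; they are passed explicitly here so the behaviour is easy to check.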

Evolution Strategies for a Population

Evolve a population from generation to generation

Evolve a population in place

General functionality for mutations

Perform mutations, but do not allow decreases in organism fitness

Perform Simple mutations on an organism’s genome

Potentially mutate any item to another in the alphabet

Perform a conversion mutation, but only at a single point in the genome

Deal with an Organism in a Genetic Algorithm population

Represent a single individual in a population

Methods for performing repairs that will Stabilize genomes

Perform repair to reduce the number of Ambiguous genes in a genome

Base selection class from which all Selectors should derive

Base class for Selector classes

Select individuals into a new population trying to maintain diversity

Implement diversity selection

Implement Roulette Wheel selection on a population

Roulette wheel selection proportional to individuals’ fitness

Provide Tournament style selection

Implement tournament style selection
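Of the selection schemes listed above, roulette wheel selection picks each individual with probability proportional to its fitness. A standard-library sketch of that idea follows; it is not the library's code, the name `roulette_select` is hypothetical, and it assumes all fitness values are non-negative.

```python
import random


def roulette_select(population, fitnesses, rng=None):
    """Pick one individual with probability proportional to its fitness.

    Hypothetical helper; assumes non-negative fitness values.
    """
    if rng is None:
        rng = random.Random()
    total = sum(fitnesses)
    if total <= 0:
        raise ValueError("total fitness must be positive")
    # Spin the wheel: a point in [0, total) falls into exactly one slice.
    pick = rng.uniform(0, total)
    running = 0.0
    for individual, fitness in zip(population, fitnesses):
        running += fitness
        if pick < running:
            return individual
    # Guard against floating-point round-off at the upper edge.
    return population[-1]
```

Passing a seeded `random.Random` instance makes the draws reproducible, which is useful when testing an evolutionary run.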

Hold GenBank data in a straightforward format

Hold information about a Feature in the Feature Table of GenBank record

Hold information about a qualifier in a GenBank feature

Hold GenBank information in a format similar to the original record

Hold information from a GenBank reference

Internal code for parsing GenBank and EMBL files (PRIVATE)

For extracting chunks of information in EMBL files

Hold GEO data in a straightforward format

Hold GEO information in a format similar to the original record

Draw representations of organism chromosomes with added information

Class for drawing a chromosome of an organism

Draw a segment of a chromosome

Top level class for drawing chromosomes

A segment that is located at the end of a linear chromosome

A segment that is located at the end of a linear chromosome

Generate RGB colours suitable for distinguishing categorical data

Implement a spiral path through HSV colour space

Plots to compare information between different sources

Display a scatter-type plot comparing two different kinds of info

Represent information for graphical display

Represent a chromosome with count information

Display information distributed across a Chromosome-like object

Display the distribution of values as a bunch of bars

Display a grouping of distributions on a page

Display the distribution of values as connected lines

Classes and functions to visualise a KGML Pathway Map

Reportlab Canvas-based representation of a KGML pathway map

Dynamic Programming algorithms for general usage

An abstract class to calculate forward and backward probabilities

Implement forward and backward algorithms using a log approach

Implement forward and backward algorithms using a rescaling approach
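The entries above mention computing forward probabilities in log space, which avoids numerical underflow on long sequences. A minimal sketch of the forward pass with log probabilities, assuming a hidden Markov model given as nested dictionaries (the names `forward_log` and `log_sum_exp` and the dictionary layout are illustrative assumptions, not the library's interface):

```python
import math


def log_sum_exp(values):
    """Return log(sum(exp(v))) computed stably."""
    highest = max(values)
    if highest == -math.inf:
        return -math.inf
    return highest + math.log(sum(math.exp(v - highest) for v in values))


def forward_log(observations, states, log_start, log_trans, log_emit):
    """Return log P(observations) under the model (hypothetical helper).

    log_start[s], log_trans[prev][s] and log_emit[s][symbol] hold
    natural-log probabilities.
    """
    if not observations:
        raise ValueError("need at least one observation")
    # Initialisation: start in state s and emit the first symbol.
    alpha = {s: log_start[s] + log_emit[s][observations[0]] for s in states}
    # Recursion: sum over predecessor states, all in log space.
    for symbol in observations[1:]:
        alpha = {
            s: log_emit[s][symbol]
            + log_sum_exp([alpha[p] + log_trans[p][s] for p in states])
            for s in states
        }
    # Termination: sum over the final states.
    return log_sum_exp(list(alpha.values()))
```

The rescaling approach named in the last entry reaches the same result by normalising the forward variables at each step instead of taking logarithms.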

Deal with representations of Markov Models

Represent a hidden markov model that can be used for state estimation

Interface to build up a Markov Model

Provide trainers which estimate parameters based on training sequences

Provide generic functionality needed in all trainers

Trainer that uses the Baum-Welch algorithm to estimate parameters

Estimate probabilities with known state sequences

Hold a training sequence with emissions and optionally, a state path

Generic functions which are useful for working with HMMs

Index.py

KD tree data structure for searching N-dimensional vectors

KD tree implementation in C++, SWIG python wrapper

Classes and functions to parse a KGML pathway map

Parses a KGML XML Pathway entry into a Pathway object

Classes to represent a KGML Pathway Map

An Entry subelement used to represent a complex node

Represent an Entry from KGML

An Entry subelement used to represent the visual representation

Represents a KGML pathway from KEGG

A specific chemical reaction with substrates and products

A relationship between two products, KOs, or protein and compound

Provides code to access the REST-style KEGG online API

Code for doing logistic regressions

Holds information necessary to do logistic regression classification

A state-emitting MarkovModel

Create a state-emitting MarkovModel object

NOEtools: For predicting NOE coordinates from assignment data

General Naive Bayes learner

Hold information for a NaiveBayes classifier

Model a single layer in a neural network

Abstract base class for all layers

Represent Neural Networks

Represent a Basic Neural Network with three layers

Find and deal with motifs in biological sequence data

Convert motifs and a sequence into neural network representations

Find motifs in a set of Sequence Records

Generic functionality useful for all gene representations

Allow reading and writing of patterns to files

Hold a list of specific patterns found in sequences

Deal with Motifs or Signatures allowing ambiguity in the sequences

Calculate fitness for schemas that differentiate between sequences

Find schemas using a genetic algorithm approach

Calculate a fitness giving weight to schemas that match many times

Generate a random motif within given parameters

Deal with motifs that have ambiguity characters in them

Convert a sequence into a representation of ambiguous motifs (schemas)

Alphabet of a simple Schema for DNA sequences

Generate Schema from inputs of Motifs or Signatures

Find schema in a set of sequences using a genetic algorithm approach

Determine when we are done evolving motifs

Find and deal with signatures in biological sequence data

Convert a Sequence into its signature representatives

Find Signatures in a group of sequence records

Classes to help deal with stopping training a neural network

Class to stop training on a network when the validation error increases

Provide classes for dealing with Training Neural Networks

Manage a grouping of Training Examples

Hold inputs and outputs of a training example

Nexus class. Parse the contents of a NEXUS file

Represent a NEXUS block with block name and list of commandlines

Helps reading NEXUS-words and characters from a buffer (semi-PRIVATE)

Represent a commandline as command and options

Initialize the class

Calculate a stepmatrix for weighted parsimony

Linked list functionality for use in Bio.Nexus

Stores a list of nodes that are linked together

A single node

Objects to represent NEXUS standard data type matrix coding

Create a StandardData iterable object

Tree class to handle phylogenetic trees

Store tree-relevant data associated with nodes (e.g. branches or otus)

Represent a tree using a chain of nodes with one predecessor (=ancestor) and multiple successors (=subclades)

Class that maps (chain_id, residue_id) to a residue property

Atom class, used in Structure objects

Create Atom object

Contains all Atom objects that represent the same disordered atom

Chain class, used in Structure objects

Initialize the class

Run DSSP and parse secondary structure and accessibility

Code for chopping up (dicing) a structure

Only accepts residues with right chainid, between start and end

Base class for Residue, Chain, Model and Structure classes

Wrapper class to group equivalent Entities

Basic container object for PDB hierarchy

Classify protein backbone structure according to Kolodny et al’s fragment libraries

Represent a polypeptide C-alpha fragment

Map polypeptides in a model to lists of representative fragments

Half-sphere exposure and coordination number calculation

Residue exposure as number of CA atoms around its CA atom

Class to calculate HSE based on the approximate CA-CB vectors

Class to calculate HSE based on the real CA-CB vectors

Turn an mmCIF file into a dictionary

Parse a mmCIF file and return a dictionary

mmCIF parsers

Parse an MMCIF file and return a Structure object

Parse a mmCIF file and return a Structure object

Model class, used in Structure objects

The object representing a model in a structure. In a structure derived from an X-ray crystallography experiment, only a single model will be present (with some exceptions). NMR structures normally contain many different models

Interface for the program NACCESS

Initialize the class

Fast atom neighbor lookup using a KD tree (implemented in C++)

Class for neighbor searching

Some Bio.PDB-specific exceptions

Output of PDB files

Write a Structure object (or a subset of a Structure object) as a PDB file

Select everything for PDB output (for use as a base class)

Access the PDB over the internet (e.g. to download structures)

Parser for PDB files

Parse a PDB file and return a Structure object

Wrappers for PSEA, a program for secondary structure assignment

Initialize the class

Polypeptide-related classes (construction and representation)

Use CA-CA distance to find polypeptides

Use C-N distance to find polypeptides

A polypeptide is simply a list of Residue objects

Residue class, used by Structure objects

DisorderedResidue is a wrapper around two or more Residue objects

Represents a residue. A Residue object stores atoms

Calculation of residue depth using command line tool MSMS

Calculate residue and CA depth for all residues

Selection of atoms, residues, etc

The structure class, representing a macromolecular structure

The Structure class contains a collection of Model instances

Map residues of two structures to each other based on a FASTA alignment

Class to align two structures based on an alignment of their sequences

Consumer class that builds a Structure object

Deals with constructing the Structure object

Superimpose two structures

Rotate/translate one set of atoms on top of another, thereby minimizing the RMSD

Vector class, including rotation-related functions

Code to support writing parsers (DEPRECATED)

Base class for other Consumers

Base class for other parsers

Debugging consumer which tags data with the event and logs it

A directed graph abstraction with labeled edges

Depth first search of g

A directed multigraph abstraction with labeled edges

Base classes for Bio.Phylo objects

Indicates the color of a clade when rendered graphically

A recursively defined sub-tree

A phylogenetic tree, containing global info for the phylogeny

Base class for all Bio.Phylo classes

Methods for Tree- and Clade-based classes

Classes corresponding to CDAO trees

CDAO Clade (sub-tree) object

CDAO Tree object

I/O function wrappers for the RDF/CDAO file format

Exception raised when CDAO object construction cannot continue (DEPRECATED)

Parse a CDAO tree given a file handle

Based on the writer in Bio.Nexus.Trees (str, to_string)

Classes and methods for finding consensus trees

Classes corresponding to NeXML trees

NeXML Clade (sub-tree) object

NeXML Tree object

I/O function wrappers for the NeXML file format

Exception raised when NeXML object construction cannot continue

Parse a NeXML tree given a file handle

Based on the writer in Bio.Nexus.Trees (str, to_string)

Classes corresponding to Newick trees, also used for Nexus trees

Newick Clade (sub-tree) object

Newick Tree object

I/O function wrappers for the Newick file format

Exception raised when Newick object construction cannot continue

Parse a Newick tree given a file handle

Based on the writer in Bio.Nexus.Trees (str, to_string)

Classes corresponding to phyloXML elements

Captures the local part in a sequence identifier

The annotation of a molecular sequence

Binary characters at the root of a clade

Initialize parameters for the BranchColor object

Describes a branch of the current phylogenetic tree

Expresses a typed relationship between two clades

A general purpose confidence element

A date associated with a clade/node

Geographic distribution of the items of a clade (species, sequences)

Domain architecture of a protein

Events at the root node of a clade (e.g. one gene duplication)

A general-purpose identifier element

Store a molecular sequence

Container for non-phyloXML elements in the tree

Base class for all PhyloXML objects

Warning for non-compliance with the phyloXML specification

A phylogenetic tree

Root node of the PhyloXML document

Geographic coordinates of a point, with an optional altitude

A polygon defined by a list of ‘Points’ (used by element ‘Distribution’)

A typed and referenced property from an external resource

Represents an individual domain in a domain architecture

Literature reference for a clade

A molecular sequence (Protein, DNA, RNA) associated with a node

Express a typed relationship between two sequences

Describe taxonomic information for a clade

PhyloXML reader/parser, writer, and associated functions

Methods for parsing all phyloXML nodes from an XML stream

Exception raised when PhyloXML object construction cannot continue

Methods for serializing a PhyloXML object to XML

Classes and methods for tree construction

Class to calculate the distance matrix from a DNA or protein multiple sequence alignment

Distance matrix class that can be used for distance based tree algorithms

Distance based tree constructor

Tree searching with Nearest Neighbor Interchanges (NNI) algorithm

Parsimony scorer with a scoring matrix

Base class for all tree scoring methods

Base class for all tree constructors

Base class for all tree searching methods

Module to control GenePop

Control GenePop through an easier interface

Code to parse BIG GenePop files

Holds information from a GenePop record

Large file parsing of Genepop files

Holds information from a GenePop record

Utility functions to deal with GenePop files

PrintFormat allows the printing of results of restriction analysis

Configuration of the console

Restriction Enzyme classes

Implement the methods that are common to all restriction enzymes

Implement methods for enzymes that produce variable overhangs

Provide methods for enhanced analysis and pretty printing

Implement methods for enzymes that produce blunt ends

Implement methods for enzymes which are commercially available

Implement methods for enzymes with defined recognition site and cut

FormattedSeq(seq, [linear=True]) -> new FormattedSeq

Implement the information about methylation

Implement information about methylation sensitivity

Implement the methods specific to the enzymes that do not cut

Implement methods for enzymes with non-palindromic recognition sites

Implement methods for enzymes with non-characterized overhangs

Implement methods for enzymes which are not commercially available

Implement the methods for enzymes that cut the DNA only once

Implement methods for enzymes that produce 3’ overhanging ends

Implement methods for enzymes that produce 5’ overhanging ends

Implement methods for enzymes with palindromic recognition sites

Class for operations on more than one enzyme

RestrictionType. Type from which all enzyme classes are derived

Implement the methods for enzymes that cut the DNA twice

Implement methods for enzymes that produce unknown overhangs

Handle the SCOP CLAssification file, which describes SCOP domains

A CLA file indexed by SCOP identifiers for rapid random access

Holds information for one SCOP domain

Handle the SCOP DEScription file

Holds information for one node in the SCOP hierarchy

Handle the SCOP DOMain file

Holds information for one SCOP domain

Handle the SCOP HIErarchy files, which describe the SCOP hierarchy in terms of SCOP unique identifiers (sunid)

Holds information for one node in the SCOP hierarchy

ASTRAL RAF (Rapid Access Format) Sequence Maps

A single residue mapping from a RAF record

An ASTRAL RAF (Rapid Access Format) Sequence Map

An RAF file index

A collection of residues from a PDB structure

A collection of residues from a PDB structure

Bio.SearchIO parser for BLAT output formats

Indexer class for BLAT PSL output

Parser for the BLAT PSL format

Writer for the blat-psl format

Indexer class for Bill Pearson’s FASTA suite’s -m 10 output

Parser for Bill Pearson’s FASTA suite’s -m 10 output

Provide objects to represent biological sequences with alphabets

An editable sequence object (with an alphabet)

Read-only sequence object (essentially a string with an alphabet)

Read-only sequence object of known length but unknown contents

Represent a Sequence Feature holding info about a part of a sequence

Abstract base class representing a position

Specify a position where the actual location is found after it

Specify a position where the actual location occurs before it

Specify the position of a boundary between two coordinates (OBSOLETE?)

Specify the specific position of a boundary

Specify the location of a feature along a sequence

Specify a position where the location can be multiple positions

Simple class to hold information about a gap between positions

Represent a Generic Reference object

Specify a specific position which is uncertain

Specify a specific position which is unknown (has no position)

Specify the position of a boundary within some coordinates

Bio.SeqIO parser for the ABI format

Bio.SeqIO support for the “ace” file format

Bio.SeqIO support for the “fasta” (aka FastA or Pearson) file format

Class to write Fasta format files

Bio.SeqIO support for the “ig” (IntelliGenetics or MASE) file format

Bio.SeqIO support for the “genbank” and “embl” file formats

IMGT writer (EMBL format variant)

Bio.SeqIO support module (not for general use)

Base class for building SeqRecord iterators

Base class for building SeqRecord writers

Base class for sequence writers. This class should be subclassed

Returns SeqRecord objects for each chain in a PDB file

Bio.SeqIO support for the “phd” file format

Class to write Phd format files

Bio.SeqIO support for the “pir” (aka PIR or NBRF) file format

Class to write PIR format files

Class to write standard FASTQ format files (using PHRED quality scores)

Class to write QUAL format files (using PHRED quality scores)

Bio.SeqIO support for the “seqxml” file format, SeqXML

Breaks seqXML file into SeqRecords

Writes SeqRecords into seqXML file

Base class for building iterators for record style XML formats

Bio.SeqIO support for the binary Standard Flowgram Format (SFF) file format

SFF file writer

Bio.SeqIO support for the “swiss” (aka SwissProt/UniProt) file format

Bio.SeqIO support for the “tab” (simple tab separated) file format

Class to write simple tab separated format files

Bio.SeqIO support for the “uniprot-xml” file format

Parse a UniProt XML entry to a SeqRecord

Represent a Sequence Record, a sequence with annotation

Functions to calculate assorted sequence checksums
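A checksum condenses a sequence into a short value that can be compared to detect changes. As a hedged illustration of the idea using only the standard library (the helper name `crc32_checksum` is hypothetical and this is not the module's own code), a CRC-32 over the sequence letters can be computed like this:

```python
import zlib


def crc32_checksum(sequence):
    """Return the CRC-32 of an ASCII sequence string as an unsigned int.

    Hypothetical helper for illustration.
    """
    # Mask to 32 bits so the result is always a non-negative integer.
    return zlib.crc32(sequence.encode("ascii")) & 0xFFFFFFFF
```

Note that a plain CRC-32 is case-sensitive, so "acgt" and "ACGT" give different values unless the sequence is normalised first.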

A codon adaptation index (CAI) implementation

Codon adaptation indexes, including the Sharp and Li (1987) E. coli index

Calculate isoelectric points of polypeptides using methods of Bjellqvist

Calculate the melting temperature of nucleotide sequences
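One of the simplest melting temperature estimates is the Wallace rule, Tm = 2(A+T) + 4(G+C) in degrees Celsius, a rule of thumb intended for short oligonucleotides. The sketch below implements only that rule on a plain string; the function name `wallace_tm` is an assumption for illustration, and more accurate methods (for example nearest-neighbor thermodynamics) are not shown.

```python
def wallace_tm(sequence):
    """Estimate Tm in degrees Celsius with the Wallace rule (hypothetical helper).

    Accepts unambiguous DNA only (A, C, G, T; case-insensitive).
    """
    seq = sequence.upper()
    at_count = seq.count("A") + seq.count("T")
    gc_count = seq.count("G") + seq.count("C")
    if at_count + gc_count != len(seq):
        raise ValueError("sequence must contain only A, C, G and T")
    # 2 degrees per A/T base, 4 degrees per G/C base.
    return 2.0 * at_count + 4.0 * gc_count
```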

Simple protein analysis

Class containing methods for protein analysis

Indices to be used with ProtParam

Holds data of an ACE file

Holds information about a read supporting an ACE contig

Parser for PHD files output by PHRED and used by PHRAP and CONSED

Hold information from a PHD file

A class to handle frequency tables or letter count files

Initialize the class

Substitution matrices for use in alignments, etc

Code to parse the keywlist.txt file from SwissProt/UniProt

Store information of one keyword or category from the keywords list

Parsers for the GAF, GPA and GPI formats from UniProt-GOA

This documentation for Biopython is extracted from the source code