biopython v1.71.0 Bio.SeqUtils.ProtParam.ProteinAnalysis
Class containing methods for protein analysis.
The constructor takes two arguments. The first is the protein sequence as a string, which is then converted to a sequence object using the Bio.Seq module. This is done just to make sure the sequence is a protein sequence and not anything else.
The second argument is optional. If set to True, the weight of the amino acids will be calculated using their monoisotopic mass (the weight of the most abundant isotopes for each element), instead of the average molecular mass (the averaged weight of all stable isotopes for each element). If set to False (the default value) or left out, the IUPAC average molecular mass will be used for the calculation.
Summary
Functions
Initialize the class
Makes a list of the relative weights of the window edges compared to the window center. The weights are linear. Only the first half of the list is generated. For a window of size 9 and edge 0.4 you get a list of [0.4, 0.55, 0.7, 0.85]
Calculate the aromaticity according to Lobry, 1994
Count standard amino acids and return a dict
Calculate the flexibility according to Vihinen, 1994
Calculate the amino acid content in percentages
Calculate the GRAVY (grand average of hydropathy) according to Kyte and Doolittle
Calculate the instability index according to Guruprasad et al 1990
Calculate the isoelectric point
Calculate the molar extinction coefficient
Calculate the molecular weight (MW) from the protein sequence
Compute a profile by any amino acid scale
Calculate fraction of helix, turn and sheet
Functions
Initialize the class.
Makes a list of the relative weights of the window edges compared to the window center. The weights are linear. Only the first half of the list is generated. For a window of size 9 and edge 0.4 you get a list of [0.4, 0.55, 0.7, 0.85].
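The linear weighting can be sketched in a few lines. This is a minimal standalone illustration rather than the library's own code; the function name half_window_weights and its parameter names are hypothetical, and a window larger than 1 is assumed.

```python
def half_window_weights(window, edge):
    """Return linear weights for the first half of a window, center excluded.

    Hypothetical helper for illustration: weights rise linearly from `edge`
    at the outermost position toward 1.0 at the window center.
    Assumes window > 1.
    """
    # Step between neighboring positions, chosen so the center would reach 1.0.
    step = 2 * (1.0 - edge) / (window - 1)
    return [edge + step * i for i in range(window // 2)]


# Window of 9 with edge 0.4: approximately [0.4, 0.55, 0.7, 0.85]
# (floating-point rounding may show up in the last digits).
print(half_window_weights(9, 0.4))
```

The second half of the window is the mirror image of this list, which is why only one half needs to be generated.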
Calculate the aromaticity according to Lobry, 1994.
Calculates the aromaticity value of a protein according to Lobry, 1994. It is simply the relative frequency of Phe+Trp+Tyr.
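As a rough illustration of that definition, the relative frequency of Phe, Trp and Tyr can be computed directly from a sequence string. The sketch below is standalone Python 3 and does not use the library; the function name is chosen for readability only.

```python
def aromaticity(sequence):
    """Relative frequency of Phe (F), Trp (W) and Tyr (Y) in `sequence`."""
    sequence = sequence.upper()
    # Count the three aromatic residues and divide by the sequence length.
    aromatic = sum(sequence.count(aa) for aa in "FWY")
    return aromatic / len(sequence)


# 3 of the 4 residues are aromatic, so the value is 0.75.
print(aromaticity("FWYA"))
```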
Count standard amino acids and return a dict.
Counts the number of times each amino acid occurs in the protein sequence. Returns a dictionary {AminoAcid:Number}.
The return value is cached in self.amino_acids_content. It is not recalculated upon subsequent calls.
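The count-once-then-cache behaviour can be sketched as follows. The class name ResidueCounter is a hypothetical stand-in written for this illustration; only the attribute name amino_acids_content comes from the description above.

```python
STANDARD_AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


class ResidueCounter:
    """Hypothetical stand-in showing the count-once-then-cache pattern."""

    def __init__(self, sequence):
        self.sequence = sequence.upper()
        self.amino_acids_content = None  # filled in on the first call

    def count_amino_acids(self):
        # Only count when no cached result exists yet.
        if self.amino_acids_content is None:
            self.amino_acids_content = {
                aa: self.sequence.count(aa) for aa in STANDARD_AMINO_ACIDS
            }
        return self.amino_acids_content


counter = ResidueCounter("AAC")
print(counter.count_amino_acids()["A"])
```

One consequence of this pattern is that the cached dictionary does not change if the sequence attribute is modified after the first call.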
Calculate the flexibility according to Vihinen, 1994.
There is no argument to change the window size because the parameters are specific to a window of 9. The parameters used are optimized for determining the flexibility.
Calculate the amino acid content in percentages.
The same as count_amino_acids, except that each count is returned as a percentage of the entire sequence. Returns a dictionary of {AminoAcid:percentage}.
The return value is cached in self.amino_acids_percent.
The input is the dictionary self.amino_acids_content. The output is a dictionary with amino acids as keys.
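The conversion from counts to shares of the sequence is a single division per amino acid. The sketch below is standalone and uses a hypothetical function name. It computes fractions between 0 and 1 and converts them to percentages as a separate step, so either convention can be read off.

```python
def amino_acid_fractions(counts, length):
    """Convert a {AminoAcid: count} dict into {AminoAcid: share of sequence}."""
    return {aa: n / length for aa, n in counts.items()}


counts = {"A": 2, "C": 1, "D": 0}  # counts for the toy sequence "AAC"
fractions = amino_acid_fractions(counts, 3)
# Scale to percentages where that convention is preferred.
percentages = {aa: 100 * f for aa, f in fractions.items()}
print(fractions, percentages)
```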
Calculate the GRAVY (grand average of hydropathy) according to Kyte and Doolittle.
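GRAVY is commonly defined as the sum of the hydropathy values of all residues divided by the sequence length. The sketch below assumes that definition and uses the commonly tabulated Kyte-Doolittle hydropathy values; the table is included for illustration and is not taken from the text above.

```python
# Kyte-Doolittle hydropathy values as commonly tabulated (assumption of this sketch).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Mean hydropathy over all residues of `sequence`."""
    sequence = sequence.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


# Positive values indicate a hydrophobic sequence, negative a hydrophilic one.
print(gravy("AILV"), gravy("RKDE"))
```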
Calculate the instability index according to Guruprasad et al 1990.
Implementation of the method of Guruprasad et al. 1990 to test a protein for stability. Any value above 40 means the protein is unstable (has a short half life).
See: Guruprasad K., Reddy B.V.B., Pandit M.W. Protein Engineering 4:155-161(1990).
Calculate the isoelectric point.
Uses the module IsoelectricPoint to calculate the pI of a protein.
Calculate the molar extinction coefficient.
Calculates the molar extinction coefficient under two assumptions: that all Cys residues are reduced (cysteines), and that they are paired as cystines (Cys-Cys bonds).
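A minimal sketch of the two cases follows. The per-residue coefficients (5500 for Trp, 1490 for Tyr and 125 per cystine, at 280 nm) are the values commonly used for this estimate and are an assumption here, as are the function name and the return order.

```python
def molar_extinction_coefficients(sequence):
    """Return (reduced, with_cystines) extinction coefficient estimates.

    Coefficients are the commonly used 280 nm values (assumption).
    """
    sequence = sequence.upper()
    n_trp = sequence.count("W")
    n_tyr = sequence.count("Y")
    n_cys = sequence.count("C")
    # All Cys reduced: only Trp and Tyr contribute.
    reduced = n_trp * 5500 + n_tyr * 1490
    # Every pair of Cys forms one cystine; an unpaired Cys adds nothing.
    with_cystines = reduced + (n_cys // 2) * 125
    return reduced, with_cystines


print(molar_extinction_coefficients("WYCC"))
```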
Calculate the molecular weight (MW) from the protein sequence.
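The difference between the average and the monoisotopic calculation can be illustrated with a small standalone sketch. It assumes the usual chemistry of summing the masses of the free amino acids and subtracting one water per peptide bond; the mass tables cover only three residues, are rounded, and all names are hypothetical.

```python
# Rounded masses of free amino acids in Da; only three residues are listed.
AVERAGE_MASS = {"G": 75.07, "A": 89.09, "S": 105.09}
MONOISOTOPIC_MASS = {"G": 75.032, "A": 89.048, "S": 105.043}
WATER_AVERAGE = 18.015
WATER_MONOISOTOPIC = 18.011


def molecular_weight(sequence, monoisotopic=False):
    """Sum residue masses and subtract one water per peptide bond."""
    masses = MONOISOTOPIC_MASS if monoisotopic else AVERAGE_MASS
    water = WATER_MONOISOTOPIC if monoisotopic else WATER_AVERAGE
    total = sum(masses[aa] for aa in sequence.upper())
    return total - (len(sequence) - 1) * water


print(molecular_weight("GA"), molecular_weight("GA", monoisotopic=True))
```

The monoisotopic value is slightly lower because it counts only the most abundant isotope of each element.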
Compute a profile by any amino acid scale.
An amino acid scale is defined by a numerical value assigned to each type of amino acid. The most frequently used scales are the hydrophobicity or hydrophilicity scales and the secondary structure conformational parameters scales, but many other scales exist which are based on different chemical and physical properties of the amino acids. You can set several parameters that control the computation of a scale profile, such as the window size and the window edge relative weight value.
WindowSize: The window size is the length of the interval to use for the profile computation. For a window size n, the (n-1)/2 neighboring residues on each side of residue i are used to compute the score for residue i. The score for residue i is the sum of the scaled values for these amino acids, optionally weighted according to their position in the window.
Edge: The central amino acid of the window always has a weight of 1. By default, the amino acids at the remaining window positions have the same weight, but you can make the residue at the center of the window have a larger weight than the others by setting the edge value for the residues at the beginning and end of the interval to a value between 0 and 1. For instance, for Edge=0.4 and a window size of 5 the weights will be: 0.4, 0.7, 1.0, 0.7, 0.4.
The method returns a list of values which can be plotted to view the change along a protein sequence. Many scales exist. Just add your favorites to the ProtParamData module.
Similar to ExPASy’s ProtScale: http://www.expasy.org/cgi-bin/protscale.pl
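The windowed computation described above can be sketched as a standalone function. The function name scale_profile, its parameter names and the toy scale are hypothetical. The sketch assumes an odd window of at least 3 and divides each score by the total weight, which is an assumption made here so that scores stay within the range of the scale.

```python
def scale_profile(sequence, scale, window, edge=1.0):
    """Sliding-window profile of `scale` values along `sequence`.

    Assumes an odd `window` of at least 3. Dividing by the total weight
    is an assumption of this sketch, not something stated in the text.
    """
    half = window // 2
    step = 2 * (1.0 - edge) / (window - 1)
    left = [edge + step * i for i in range(half)]
    # Full symmetric weight list with the central residue weighted 1.0.
    weights = left + [1.0] + left[::-1]
    total = sum(weights)
    profile = []
    for start in range(len(sequence) - window + 1):
        residues = sequence[start:start + window]
        score = sum(w * scale[aa] for w, aa in zip(weights, residues))
        profile.append(score / total)
    return profile


toy_scale = {"A": 1.0, "G": 0.0}  # hypothetical two-letter scale
print(scale_profile("GGAGGGG", toy_scale, 5, edge=0.4))
```

A sequence of length m yields m - window + 1 values, one per full window, so the first and last (window-1)/2 residues have no score of their own.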
Calculate fraction of helix, turn and sheet.
Computes the fraction of amino acids which tend to be in helix, turn or sheet.
Amino acids in helix: V, I, Y, F, W, L. Amino acids in Turn: N, P, G, S. Amino acids in sheet: E, M, A, L.
Returns a tuple of three floats (Helix, Turn, Sheet).
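Using the residue groups exactly as listed above, the three fractions can be sketched as follows. This is a standalone illustration with a hypothetical function body, not the library's implementation.

```python
# Residue groups as listed in the description above.
HELIX = "VIYFWL"
TURN = "NPGS"
SHEET = "EMAL"


def secondary_structure_fraction(sequence):
    """Return (helix, turn, sheet) fractions of residues in `sequence`."""
    sequence = sequence.upper()
    length = len(sequence)

    def fraction(group):
        # Share of the sequence made up of residues from `group`.
        return sum(sequence.count(aa) for aa in group) / length

    return fraction(HELIX), fraction(TURN), fraction(SHEET)


print(secondary_structure_fraction("VNEL"))
```

Because L appears in both the helix and the sheet group, the three fractions are not exclusive and need not sum to 1.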