biopython v1.71.0 Bio.NeuralNetwork.Gene.Pattern.PatternRepository

Hold a list of specific patterns found in sequences.

This is designed to be a general holder for a set of patterns and should be subclassed for specific implementations (i.e., holding Motifs or Signatures).

Summary

Functions

Initialize a repository with patterns.

Return the number of times the specified pattern is found

Retrieve all of the patterns in the repository

Retrieve patterns that are at the extreme ranges

Retrieve the specified number of patterns randomly

Return the specified number of most frequently occurring patterns

Return a percentage of the patterns

Remove patterns which are likely due to polyA tails from the lists

Functions

Initialize a repository with patterns.

Arguments:

  • pattern_info - A representation of all of the patterns found in a finder search. This should be a dictionary, where the keys are patterns, and the values are the number of times a pattern is found.

The patterns are represented internally as a list of 2-tuples, where the first element is the number of times a pattern occurs and the second is the pattern itself. Putting the count first makes it easy to sort the list and return the top N patterns.
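The count-first tuple layout can be illustrated with a minimal sketch. This is not the Biopython implementation; the example dictionary and the variable names are assumptions made up for illustration.

```python
# Hypothetical example data: pattern -> number of times it was found.
pattern_info = {"GATC": 7, "AAAA": 12, "CGCG": 3}

# Build (count, pattern) tuples so that tuple comparison orders by count first.
pattern_list = [(count, pattern) for pattern, count in pattern_info.items()]

# Sort in descending order: most frequent patterns come first.
pattern_list.sort(reverse=True)

# The top N patterns are then a simple slice of the sorted list.
top_two = [pattern for count, pattern in pattern_list[:2]]
print(top_two)
```

Because tuples compare element by element, patterns with equal counts fall back to being ordered by the pattern string itself.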

Return the number of times the specified pattern is found.

Retrieve all of the patterns in the repository.

get_differing()

Retrieve patterns that are at the extreme ranges.

This returns patterns from both the top of the list (i.e., the same as returned by get_top) and the bottom of the list. This is especially useful when the patterns represent the differences between two sets of patterns.

Arguments:

  • top_num - The number of patterns to take from the top of the list.
  • bottom_num - The number of patterns to take from the bottom of the list.
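Taking both ends of a ranked list can be sketched as follows. This is a standalone illustration that assumes the patterns are already sorted from most to least frequent; the function below is a hypothetical stand-in, not the library's own code.

```python
def get_differing(patterns, top_num, bottom_num):
    """Return the first top_num and the last bottom_num items of a ranked list."""
    top = patterns[:top_num]
    # Guard against bottom_num == 0, since patterns[-0:] would be the whole list.
    bottom = patterns[-bottom_num:] if bottom_num > 0 else []
    return top + bottom


# Hypothetical ranked patterns, most frequent first.
ranked = ["AAAA", "GATC", "TTGA", "CGCG", "ACGT"]
extremes = get_differing(ranked, 2, 1)
print(extremes)
```

In this sketch, if top_num plus bottom_num exceeds the length of the list, some patterns appear twice in the result.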

Retrieve the specified number of patterns randomly.

Randomly selects patterns from the list and returns them.

Arguments:

  • num_patterns - The total number of patterns to return.
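One way to pick patterns at random is sketched below. It assumes sampling without replacement, so no pattern is returned twice; the original text does not state this, and the helper name and data are hypothetical.

```python
import random


def get_random(patterns, num_patterns):
    """Return num_patterns distinct items chosen at random from patterns."""
    # random.sample raises ValueError if num_patterns exceeds len(patterns).
    return random.sample(patterns, num_patterns)


# Hypothetical list of patterns held by a repository.
all_patterns = ["AAAA", "GATC", "TTGA", "CGCG", "ACGT"]
chosen = get_random(all_patterns, 2)
print(chosen)
```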

Return the specified number of most frequently occurring patterns.

Arguments:

  • num_patterns - The number of patterns to return.

get_top_percentage()

Return a percentage of the patterns.

This returns the top ‘percent’ percentage of the patterns in the repository.
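A minimal sketch of selecting the top share of a ranked list follows. It assumes that percent is given as a fraction between 0 and 1 and that the resulting count is truncated to an integer; neither detail is stated in the text above, and the function is a hypothetical stand-in.

```python
def get_top_percentage(patterns, percent):
    """Return the leading share of a ranked list of patterns."""
    # Assumption: percent is a fraction (0.5 means 50%); int() truncates.
    num_to_return = int(len(patterns) * percent)
    return patterns[:num_to_return]


# Hypothetical ranked patterns, most frequent first.
ranked = ["AAAA", "GATC", "TTGA", "CGCG"]
top_half = get_top_percentage(ranked, 0.5)
print(top_half)
```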

Remove patterns which are likely due to polyA tails from the lists.

This is a helper function to remove patterns which are likely just due to polyA tails, and thus are not really useful motifs. It will also remove patterns like ATATAT, which might be a useful motif, so use it at your own discretion.

XXX Could we write a more general function, based on info content or something like that?

Arguments:

  • at_percentage - The percentage of A and T residues in a pattern that qualifies it for being removed.
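The A/T filter described above can be sketched as follows. This is an illustration only: it assumes at_percentage is a fraction between 0 and 1, that a pattern is removed when its A/T fraction is strictly greater than the threshold, and that patterns are non-empty uppercase strings. The default of 0.9 and both function names are hypothetical.

```python
def at_fraction(pattern):
    """Fraction of residues in pattern that are A or T (pattern must be non-empty)."""
    return (pattern.count("A") + pattern.count("T")) / len(pattern)


def remove_at_rich(patterns, at_percentage=0.9):
    """Keep only patterns whose A/T fraction does not exceed the threshold."""
    # Assumption: the threshold is exclusive, so a pattern exactly at it is kept.
    return [p for p in patterns if at_fraction(p) <= at_percentage]


# Hypothetical patterns; ATATAT is dropped along with the polyA run.
patterns = ["AAAAAA", "ATATAT", "GATTAC", "CGCGCG"]
kept = remove_at_rich(patterns)
print(kept)
```

The example shows the caveat from the description: ATATAT is filtered out together with AAAAAA because both consist entirely of A and T residues.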