cogent3.core.alphabet.CharAlphabet
- class CharAlphabet(chars: TStrOrBytes | PySeq[TStrOrBytes], gap: TStrOrBytes | None = None, missing: TStrOrBytes | None = None)
A class representing fundamental monomer character sets.
- Attributes:
- gap_char
- gap_index
- missing_char
- missing_index
- moltype
- motif_len
- num_canonical
Methods
- array_to_bytes(seq): returns seq as a byte string
- as_bytes(): returns self as a byte string
- convert_seq_array_to(*, alphabet, seq[, ...]): converts a numpy array with indices from self to other
- count(value, /): Return number of occurrences of value.
- from_indices(seq): returns a string from a sequence of indices
- from_rich_dict(data): returns an instance from a serialised dictionary
- get_kmer_alphabet(k[, include_gap]): returns kmer alphabet with words of size k
- get_subset(motif_subset[, excluded]): Returns a new Alphabet object containing a subset of motifs in self.
- index(value[, start, stop]): Return first index of value.
- is_valid(seq): seq is valid for alphabet
- to_indices(seq[, validate]): returns a sequence of indices for the characters in seq
- to_json(): returns a serialisable string
- to_rich_dict([for_pickle]): returns a serialisable dictionary
- with_gap_motif([gap_char, missing_char, ...]): returns new monomer alphabet with gap and missing characters added
Notes
Provides methods for efficient conversion between characters and integers from fundamental types of strings, bytes and numpy arrays.
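The character-to-integer conversion described above can be illustrated with a minimal, standard-library-only sketch. This is not cogent3's implementation: `MiniCharAlphabet` is a hypothetical stand-in, its method names merely mirror `to_indices`, `from_indices` and `is_valid` from the list above, and the use of byte translation tables is an assumption chosen for brevity. The character order "TCAG" is likewise only an example.

```python
class MiniCharAlphabet:
    """Hypothetical illustration only; not cogent3's CharAlphabet."""

    def __init__(self, chars: str) -> None:
        if len(set(chars)) != len(chars):
            raise ValueError("alphabet characters must be unique")
        self.chars = chars
        encoded = chars.encode("ascii")
        indices = bytes(range(len(encoded)))
        # 256-entry translation tables: character byte -> index byte, and back.
        self._to_index = bytes.maketrans(encoded, indices)
        self._from_index = bytes.maketrans(indices, encoded)

    def is_valid(self, seq: str) -> bool:
        # Valid when every character of seq belongs to the alphabet.
        return set(seq) <= set(self.chars)

    def to_indices(self, seq: str) -> list[int]:
        if not self.is_valid(seq):
            raise ValueError(f"{seq!r} contains characters outside the alphabet")
        # One pass over the bytes; each byte becomes its index in self.chars.
        return list(seq.encode("ascii").translate(self._to_index))

    def from_indices(self, indices) -> str:
        indices = list(indices)
        if any(not 0 <= i < len(self.chars) for i in indices):
            raise ValueError("index outside the alphabet")
        return bytes(indices).translate(self._from_index).decode("ascii")


dna = MiniCharAlphabet("TCAG")
indices = dna.to_indices("GATTACA")  # T=0, C=1, A=2, G=3
round_trip = dna.from_indices(indices)
```

Because each character maps to its position in the alphabet, converting to indices and back returns the original sequence, and the resulting integers can be used directly as array offsets.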