Module debruijn::msp

Methods for minimum substring partitioning of a DNA string

The simple_scan method is based on: Li, Yang. "MSPKmerCounter: a fast and memory efficient approach for k-mer counting." arXiv preprint arXiv:1505.06550 (2015).

Structs

MspInterval

Functions

msp_sequence
simple_scan

Determine the MSP substrings of seq for the given k and p. Returns a vector of tuples describing each substring and its minimum p-mer: (p-mer value, min p-mer position, start position, end position). permutation is a permutation of the lexicographically sorted set of all p-mers. A permutation that orders p-mers by their inverse frequency in the dataset gives the most even bucketing of MSPs over p-mers.
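To make the partitioning idea concrete, here is a minimal standalone sketch of naive minimum substring partitioning over a byte string. It is not this crate's implementation or API: the names `naive_msp` and `min_pmer_pos` are hypothetical, p-mers are ranked in plain lexicographic order rather than through a permutation, ties go to the leftmost p-mer, and the end position is assumed to be exclusive.

```rust
/// Hypothetical helper (not part of the debruijn crate): position of the
/// smallest p-mer inside the k-mer seq[start..start + k], comparing p-mers
/// lexicographically. Ties are broken toward the leftmost occurrence.
fn min_pmer_pos(seq: &[u8], start: usize, k: usize, p: usize) -> usize {
    let mut best = start;
    for i in (start + 1)..=(start + k - p) {
        if &seq[i..i + p] < &seq[best..best + p] {
            best = i;
        }
    }
    best
}

/// Naive MSP sketch: consecutive k-mers whose minimum p-mer sits at the same
/// position are merged into one substring. Each tuple is
/// (p-mer bytes, min p-mer position, start position, end position),
/// where the end position is exclusive (an assumption of this sketch).
fn naive_msp(seq: &[u8], k: usize, p: usize) -> Vec<(Vec<u8>, usize, usize, usize)> {
    let mut out = Vec::new();
    if p == 0 || p > k || seq.len() < k {
        return out;
    }
    let n_kmers = seq.len() - k + 1;
    let mut run_start = 0;
    let mut run_min = min_pmer_pos(seq, 0, k, p);
    for i in 1..n_kmers {
        let m = min_pmer_pos(seq, i, k, p);
        if m != run_min {
            // k-mers run_start..i share the p-mer at run_min; the last of
            // them is k-mer i - 1, which ends at position i - 1 + k.
            out.push((seq[run_min..run_min + p].to_vec(), run_min, run_start, i - 1 + k));
            run_start = i;
            run_min = m;
        }
    }
    // Close the final run, which extends to the end of the sequence.
    out.push((seq[run_min..run_min + p].to_vec(), run_min, run_start, n_kmers - 1 + k));
    out
}
```

For example, calling `naive_msp(b"GTACGTT", 4, 2)` groups the first three 4-mers under the p-mer `AC` and the last one under `CG`. This version rescans every k-mer and so costs O(n·k) p-mer comparisons; it is meant to illustrate the output shape, not to be efficient.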