mazu
A Rust library for building modular, fast and compact indexes over genomic data
Mazu (媽祖)... revered as a tutelary deity of seafarers, including fishermen and sailors...
Disclaimer --- This library is in alpha and is under active development.
Highlights
- Query-ready indexes via plug-and-play k-mer-to-unitig and unitig-to-occurrence mappings.
- Load-only compatibility with pufferfish: deserialize pufferfish indices and work with them in Rust.
- Streaming queries for generic indexes for free with .as_streaming()
- An easy test-bed for new compression algorithms for unitig-occurrences and k-mer dictionaries.
- No more CMake.
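The plug-and-play idea in the first highlight can be sketched as an index built from two interchangeable components, with the query logic written once over both. This is a minimal, self-contained illustration and not mazu's actual API: every name below (`KmerToUnitig`, `UnitigToOcc`, `Index`, `HashDict`, `DenseOcc`, `toy_index`) is hypothetical, and the sketch ignores orientation and reverse complements.

```rust
use std::collections::HashMap;

/// Position of a k-mer on a unitig (hypothetical type, not from mazu).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct UnitigHit {
    pub unitig_id: usize,
    pub offset: usize,
}

/// One occurrence of a unitig on a reference sequence (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Occurrence {
    pub ref_id: usize,
    pub pos: usize,
}

/// Component 1: maps a k-mer to its position on a unitig.
pub trait KmerToUnitig {
    fn lookup(&self, kmer: &str) -> Option<UnitigHit>;
}

/// Component 2: maps a unitig to its occurrences on the references.
pub trait UnitigToOcc {
    fn occurrences(&self, unitig_id: usize) -> &[Occurrence];
}

/// An index is the pair of components; queries are written once, generically.
pub struct Index<K, U> {
    pub k2u: K,
    pub u2occ: U,
}

impl<K: KmerToUnitig, U: UnitigToOcc> Index<K, U> {
    /// Find the unitig holding `kmer`, then shift each unitig occurrence by the k-mer's offset.
    pub fn query(&self, kmer: &str) -> Vec<Occurrence> {
        match self.k2u.lookup(kmer) {
            Some(hit) => self
                .u2occ
                .occurrences(hit.unitig_id)
                .iter()
                .map(|o| Occurrence { ref_id: o.ref_id, pos: o.pos + hit.offset })
                .collect(),
            None => Vec::new(),
        }
    }
}

/// Toy k-mer dictionary backed by a HashMap.
pub struct HashDict(pub HashMap<String, UnitigHit>);

impl KmerToUnitig for HashDict {
    fn lookup(&self, kmer: &str) -> Option<UnitigHit> {
        self.0.get(kmer).copied()
    }
}

/// Toy occurrence table: one Vec of occurrences per unitig id.
pub struct DenseOcc(pub Vec<Vec<Occurrence>>);

impl UnitigToOcc for DenseOcc {
    fn occurrences(&self, unitig_id: usize) -> &[Occurrence] {
        match self.0.get(unitig_id) {
            Some(v) => v.as_slice(),
            None => &[],
        }
    }
}

/// Build a toy index over one unitig "ACGTA" (k = 3) that occurs on two references.
pub fn toy_index() -> Index<HashDict, DenseOcc> {
    let (unitig, k) = ("ACGTA", 3);
    let mut dict = HashMap::new();
    for offset in 0..=unitig.len() - k {
        dict.insert(unitig[offset..offset + k].to_string(), UnitigHit { unitig_id: 0, offset });
    }
    let occ = vec![vec![Occurrence { ref_id: 0, pos: 10 }, Occurrence { ref_id: 1, pos: 4 }]];
    Index { k2u: HashDict(dict), u2occ: DenseOcc(occ) }
}
```

Because `query` only depends on the two traits, swapping in a different dictionary or a compressed occurrence table would not require touching the query code, which is the property the highlight describes.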
Examples
// Load a pufferfish index built by the C++ implementation
let p = to_abs_path(YEAST_CHR01_INDEX);
let pi = DenseIndex::deserialize_from_cpp(p).unwrap();
// Extract unitigs and build an SSHash
let unitig_set = pi.as_ref().clone();
let sshash = SSHash::from_unitig_set(unitig_set, 15, 32, WyHashState::default()).unwrap();
// Drop in an SSHash for a new index
let pi = ModIndex::from_parts(
    pi.base.clone(),
    sshash,
    pi.as_u2pos().clone(),
    pi.as_refseqs().clone(),
);
// Generic implementations take care of query and validation
pi.validate_self();
// Attach a streaming cache and drive the index.
let driver = pi.as_streaming();
driver.validate_self();
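To give a feel for what a streaming cache can buy, here is a minimal sketch of one common approach: consecutive k-mers of a read overlap by k-1 bases and often continue along the same unitig, so a wrapper can remember the last hit and check the next unitig position before falling back to a full dictionary lookup. This is an assumption about the general technique, not a description of what mazu's `.as_streaming()` does internally; `ToyDict`, `Streaming`, and `stream_read` are hypothetical names, and the sketch handles a single unitig in forward orientation only.

```rust
use std::collections::HashMap;

/// Hypothetical toy dictionary: one unitig plus a k-mer -> offset table.
pub struct ToyDict {
    pub unitig: String,
    pub k: usize,
    pub offsets: HashMap<String, usize>,
}

impl ToyDict {
    /// Index every k-mer of `unitig`; assumes `unitig.len() >= k` and distinct k-mers.
    pub fn new(unitig: &str, k: usize) -> Self {
        let offsets = (0..=unitig.len() - k)
            .map(|i| (unitig[i..i + k].to_string(), i))
            .collect();
        ToyDict { unitig: unitig.to_string(), k, offsets }
    }

    /// Full (uncached) lookup of a k-mer's offset on the unitig.
    pub fn lookup(&self, kmer: &str) -> Option<usize> {
        self.offsets.get(kmer).copied()
    }
}

/// Streaming wrapper: caches the last hit and counts full lookups.
pub struct Streaming<'a> {
    dict: &'a ToyDict,
    last: Option<usize>,
    pub full_lookups: usize,
}

impl<'a> Streaming<'a> {
    pub fn new(dict: &'a ToyDict) -> Self {
        Streaming { dict, last: None, full_lookups: 0 }
    }

    pub fn lookup(&mut self, kmer: &str) -> Option<usize> {
        // Fast path: does the query match the k-mer right after the previous hit?
        if let Some(prev) = self.last {
            let next = prev + 1;
            if self.dict.unitig.get(next..next + self.dict.k) == Some(kmer) {
                self.last = Some(next);
                return self.last;
            }
        }
        // Slow path: fall back to the underlying dictionary.
        self.full_lookups += 1;
        self.last = self.dict.lookup(kmer);
        self.last
    }
}

/// Query every k-mer of `read` in order; returns the hits and the number of full lookups.
pub fn stream_read(dict: &ToyDict, read: &str) -> (Vec<Option<usize>>, usize) {
    if read.len() < dict.k {
        return (Vec::new(), 0);
    }
    let mut s = Streaming::new(dict);
    let hits: Vec<Option<usize>> = (0..=read.len() - dict.k)
        .map(|i| s.lookup(&read[i..i + dict.k]))
        .collect();
    (hits, s.full_lookups)
}
```

In this toy setup, streaming the read "CGTAC" over the unitig "ACGTAC" with k = 3 needs one full lookup for the first k-mer; the remaining two are answered by the cached position.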