In this pull request I introduce a trait `Encoder` with three methods: encoding, decoding and `rev_comp`. `Kmer` can use these `Encoder` methods to perform encoding, decoding and reverse-complement operations on `array`.
I also introduce two encoders. The first one, `Naive`, supports any encoding type; the downside is *slower* operations. `Xor10` is faster but supports only one encoding (A -> 00, C -> 01, T -> 10, G -> 11).
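To make the fixed encoding concrete, here is a minimal sketch of how such an encoder could work. It is not the code from this pull request: the trait signatures, the `u64` storage (so k <= 32), the method names `encode`/`decode` and the name `Xor10Sketch` are all assumptions made for illustration. It relies on the fact that bits 1 and 2 of the ASCII codes of A, C, T and G are exactly 00, 01, 10 and 11, and that complementing a base in this encoding is an XOR with `10`.

```rust
/// Hypothetical shape of the trait: three methods, as described above.
/// The signatures (u64 storage, k <= 32) are assumptions for this sketch.
pub trait Encoder {
    /// Pack ASCII nucleotides into 2-bit codes, first base in the highest used bits.
    fn encode(&self, seq: &[u8]) -> u64;
    /// Unpack `k` 2-bit codes back into ASCII nucleotides.
    fn decode(&self, bits: u64, k: usize) -> Vec<u8>;
    /// Reverse complement of a packed k-mer of length `k`.
    fn rev_comp(&self, bits: u64, k: usize) -> u64;
}

/// Illustrative encoder for the fixed mapping A -> 00, C -> 01, T -> 10, G -> 11.
pub struct Xor10Sketch;

impl Encoder for Xor10Sketch {
    fn encode(&self, seq: &[u8]) -> u64 {
        // (c >> 1) & 0b11 extracts bits 1 and 2 of the ASCII code.
        seq.iter()
            .fold(0u64, |acc, &c| (acc << 2) | u64::from((c >> 1) & 0b11))
    }

    fn decode(&self, bits: u64, k: usize) -> Vec<u8> {
        // Index = 2-bit code, value = ASCII letter.
        const LETTERS: [u8; 4] = [b'A', b'C', b'T', b'G'];
        (0..k)
            .rev()
            .map(|i| LETTERS[((bits >> (2 * i)) & 0b11) as usize])
            .collect()
    }

    fn rev_comp(&self, bits: u64, k: usize) -> u64 {
        let mut rest = bits;
        let mut out = 0u64;
        for _ in 0..k {
            // Take the last base, complement it (XOR with 10), append it to the output.
            out = (out << 2) | ((rest & 0b11) ^ 0b10);
            rest >>= 2;
        }
        out
    }
}
```

With this sketch, `Xor10Sketch.encode(b"ACTG")` gives `0b00011011`, and its reverse complement equals the encoding of `CAGT`. A table-driven encoder in the spirit of `Naive` would replace the bit trick with a lookup, which is what makes arbitrary encodings possible at the cost of speed.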
We can add many other encoders, for example one that uses SIMD instructions to convert 32 nucleotides into a u64; @Daniel-Liu-c0deb0t wrote some algorithms to do this in nuc2bits. Using a trait as the interface lets users create an `Encoder` that fits their needs perfectly.
In this pull request the `Kmer` constructor accepts a `u8` slice and an object that implements `Encoder`. We could instead move this `Encoder` into a `Kmer` type argument, but the type declaration becomes a bit of a nightmare:
let kmer: Kmer<15, u16, { word_for_k::<u16, 15>() }, Encoding::Naive::ACTG> = Kmer::new("ACTGAGAGAGACCAT");
This type complexity could be reduced by type aliasing, and further once Rust gets a more complete const generics interface.
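As an illustration of the aliasing idea, the sketch below hides the long parameter list behind a single name. Everything in it is a stand-in: the `Kmer` struct layout, the body of `word_for_k` (assumed here to return the number of words needed for K bases at 2 bits each), the marker type `NaiveActg` and the alias name `Kmer15` are hypothetical, not taken from this pull request.

```rust
use std::marker::PhantomData;

/// Stand-in for the helper: number of `T` words needed to hold K 2-bit bases.
/// The real helper may compute this differently.
pub const fn word_for_k<T, const K: usize>() -> usize {
    let bits_per_word = std::mem::size_of::<T>() * 8;
    (2 * K + bits_per_word - 1) / bits_per_word
}

/// Hypothetical marker type standing in for an encoder passed as a type argument.
pub struct NaiveActg;

/// Simplified stand-in for `Kmer` with the encoder as a type argument.
pub struct Kmer<const K: usize, T, const W: usize, E> {
    array: [T; W],
    _encoder: PhantomData<E>,
}

impl<const K: usize, T: Copy + Default, const W: usize, E> Kmer<K, T, W, E> {
    /// Build a k-mer whose storage words are all zero.
    pub fn empty() -> Self {
        Kmer {
            array: [T::default(); W],
            _encoder: PhantomData,
        }
    }

    /// Number of storage words backing this k-mer.
    pub fn words(&self) -> usize {
        self.array.len()
    }
}

/// The alias: callers write `Kmer15` instead of the full parameter list.
pub type Kmer15 = Kmer<15, u16, { word_for_k::<u16, 15>() }, NaiveActg>;
```

Under these assumptions, `Kmer15::empty()` replaces the long annotation at the call site, and the const expression is written only once, in the alias.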
To simplify the code base I also created a trait `Data` that groups all the traits that must be implemented by the type used in `array`, and adds a method `to_u8` required by the `Naive` encoder. This trait and this method could be moved or removed.
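A grouping trait of this kind could look roughly like the sketch below. The list of supertraits and the truncating behaviour of `to_u8` are guesses made for illustration; the actual bounds in this pull request may be different.

```rust
use std::ops::{BitAnd, BitOr, BitXor, Shl, Shr};

/// Hypothetical grouping trait: one bound instead of a long `where` clause.
/// The supertraits listed here are assumptions, not the set used in the PR.
pub trait Data:
    Copy
    + Default
    + BitAnd<Output = Self>
    + BitOr<Output = Self>
    + BitXor<Output = Self>
    + Shl<usize, Output = Self>
    + Shr<usize, Output = Self>
{
    /// Keep only the lowest 8 bits of the word (assumed semantics).
    fn to_u8(&self) -> u8;
}

// Implement `Data` for the unsigned integer types with a small macro.
macro_rules! impl_data {
    ($($t:ty),*) => {
        $(impl Data for $t {
            fn to_u8(&self) -> u8 {
                *self as u8 // truncating cast
            }
        })*
    };
}

impl_data!(u8, u16, u32, u64);

/// Example of generic code that needs only the single `Data` bound:
/// read the 2-bit code stored at position `index`, counted from the low end.
pub fn code_at<T: Data>(word: T, index: usize) -> u8 {
    (word >> (2 * index)).to_u8() & 0b11
}
```

The point of the design is that generic functions such as `code_at` state one bound, `T: Data`, and a table-driven encoder can turn any storage word into a `u8` index without knowing the concrete integer type.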
My implementation could probably be improved in many places, but I think the general structure is already nice, interesting and ready to be discussed.
This pull request also contains PR #8.