This explanation is going to be a bit wild.
Basically, my entire motivation for this and the previous work with `zip`, `map` and so forth was to organize safe operations in a way conducive to compiler optimizations, specifically auto-vectorization.
Unfortunately, this only seems to work on slices with a size known at compile time, I guess because they are an intrinsic type. All attempts to get a custom iterator to optimize like that have failed, even with unstable features.
Even though they technically worked, I wasn't happy with how the previous work did functional operations, with `map`/`map_ref` and `zip`/`zip_ref`. It felt a bit unintuitive. They were also strictly attached to the `GenericArray` type, so they were useless with `GenericSequence` by itself, such as in generic code.
So, I've redefined `GenericSequence` like this:
pub unsafe trait GenericSequence<T>: Sized + IntoIterator {
    type Length: ArrayLength<T>;
    type Sequence: GenericSequence<T, Length=Self::Length> + FromIterator<T>;

    fn generate<F>(f: F) -> Self::Sequence
    where
        F: Fn(usize) -> T;
}
where `Sequence` is defined as `Self` for the `GenericArray` implementation.
That may seem redundant, but now `GenericSequence` is broadly implemented for `&'a S` and `&'a mut S`, and carries over the same `Sequence`.
So:
<&'a S as GenericSequence<T>>::Sequence == <S as GenericSequence<T>>::Sequence
Furthermore, `IntoIterator` is now implemented for `&'a GenericArray<T, N>` and `&'a mut GenericArray<T, N>`, where both of those implementations use slice iterators, and each reference type automatically implements `GenericSequence<T>`.
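To make the reference forwarding concrete, here is a minimal std-only sketch. `Seq` and `same_sequence` are hypothetical names invented for illustration, plain arrays stand in for `GenericArray`, and the `Length`, `generate` and `unsafe` parts of the real trait are left out, so this is not the crate's code.

```rust
use std::any::TypeId;

/// Hypothetical, simplified stand-in for `GenericSequence<T>`.
/// Only the associated owned type is kept.
pub trait Seq<T>: Sized + IntoIterator {
    /// The owned sequence type this one maps back to.
    type Sequence;
}

// An owned array names itself as the owned sequence type.
impl<T, const N: usize> Seq<T> for [T; N] {
    type Sequence = Self;
}

// Shared references forward `Sequence` from the type they point to.
impl<'a, T, S: Seq<T>> Seq<T> for &'a S
where
    &'a S: IntoIterator,
{
    type Sequence = S::Sequence;
}

// Mutable references do the same.
impl<'a, T, S: Seq<T>> Seq<T> for &'a mut S
where
    &'a mut S: IntoIterator,
{
    type Sequence = S::Sequence;
}

/// Returns true when `A` and `B` resolve to the same owned sequence type.
pub fn same_sequence<T, A, B>() -> bool
where
    A: Seq<T>,
    B: Seq<T>,
    <A as Seq<T>>::Sequence: 'static,
    <B as Seq<T>>::Sequence: 'static,
{
    TypeId::of::<<A as Seq<T>>::Sequence>() == TypeId::of::<<B as Seq<T>>::Sequence>()
}
```

With these stand-ins, `same_sequence::<i32, &[i32; 4], [i32; 4]>()` holds, which mirrors the type equality stated above.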
Next, I've added a new trait called `MappedGenericSequence`, which looks like:
pub unsafe trait MappedGenericSequence<T, U>: GenericSequence<T>
where
    Self::Length: ArrayLength<U>,
{
    type Mapped: GenericSequence<U, Length=Self::Length>;
}
and the implementation of that for `GenericArray` is just:
unsafe impl<T, U, N> MappedGenericSequence<T, U> for GenericArray<T, N>
where
    N: ArrayLength<T> + ArrayLength<U>,
    GenericArray<U, N>: GenericSequence<U, Length=N>,
{
    type Mapped = GenericArray<U, N>;
}
As you can see, it just defines another arbitrary `GenericArray` with the same length. The transformation allows for proving one `GenericArray` can be created from another, which leads into the `FunctionalSequence` trait.
You can see the default implementation for it in `src/functional.rs`, which uses the fact that any `GenericSequence` is `IntoIterator` and the associated `Sequence` is `FromIterator` to map/zip sequences using only simple iterators.
`FunctionalSequence` is also automatically implemented for `&'a S` and `&'a mut S` where `S: GenericSequence<T>`, so they automatically work with `&GenericArray` as well.
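The iterator-only default can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the code in `src/functional.rs`: `Arr`, `Mapped` and `map_seq` are hypothetical names, `Arr` uses const generics in place of `GenericArray`'s type-level length, and `Mapped` takes only the output element type.

```rust
use std::iter::FromIterator;

/// Hypothetical fixed-length container standing in for `GenericArray<T, N>`.
#[derive(Debug, PartialEq)]
pub struct Arr<T, const N: usize>(pub [T; N]);

impl<T, const N: usize> IntoIterator for Arr<T, N> {
    type Item = T;
    type IntoIter = std::array::IntoIter<T, N>;
    fn into_iter(self) -> Self::IntoIter {
        IntoIterator::into_iter(self.0)
    }
}

// References iterate through a slice iterator.
impl<'a, T, const N: usize> IntoIterator for &'a Arr<T, N> {
    type Item = &'a T;
    type IntoIter = std::slice::Iter<'a, T>;
    fn into_iter(self) -> Self::IntoIter {
        self.0.iter()
    }
}

impl<T, const N: usize> FromIterator<T> for Arr<T, N> {
    fn from_iter<I: IntoIterator<Item = T>>(iter: I) -> Self {
        let mut it = iter.into_iter();
        // Panics if the iterator yields fewer than N elements.
        Arr(std::array::from_fn(|_| it.next().expect("iterator too short")))
    }
}

/// Hypothetical stand-in for `MappedGenericSequence`: names the
/// same-length container that holds `U`.
pub trait Mapped<U>: IntoIterator {
    type Out: FromIterator<U>;
}

impl<T, U, const N: usize> Mapped<U> for Arr<T, N> {
    type Out = Arr<U, N>;
}

impl<'a, T, U, const N: usize> Mapped<U> for &'a Arr<T, N> {
    type Out = Arr<U, N>;
}

/// `map` written with nothing but `IntoIterator` and `FromIterator`.
pub fn map_seq<U, S, F>(s: S, f: F) -> S::Out
where
    S: Mapped<U>,
    F: FnMut(S::Item) -> U,
{
    s.into_iter().map(f).collect()
}
```

Because `map_seq` only asks for `Mapped<U>`, the same function accepts both `Arr<i32, 4>` and `&Arr<i32, 4>`; the closure receives `i32` in the first case and `&i32` in the second.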
Furthermore, it's implemented directly on `GenericArray` as well, which uses the `ArrayConsumer` system to provide a lightweight and optimizable implementation, rather than relying on `GenericArrayIter`, which cannot be optimized.
As a result, code like in the assembly test:
let a = black_box(arr![i32; 1, 3, 5, 7]);
let b = black_box(arr![i32; 2, 4, 6, 8]);
let c = a.zip(&b, |l: i32, r: &i32| l + r);
assert_eq!(c, arr![i32; 3, 7, 11, 15]);
will correctly be optimized into a single VPADDD instruction, just as desired.
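For comparison, a std-only analog of that test over plain fixed-size arrays looks like the sketch below. `add4` is a hypothetical helper, not part of the crate. Whether the compiler actually emits a single vector add here depends on the optimization level, the target features (VPADDD needs AVX) and the compiler version, so treat it as a shape that is friendly to auto-vectorization rather than a guarantee.

```rust
/// Element-wise sum of an owned array and a borrowed array of the same,
/// compile-time-known length. Hypothetical helper for illustration.
pub fn add4(a: [i32; 4], b: &[i32; 4]) -> [i32; 4] {
    let mut out = a;
    // The length is part of the type, so there is no runtime length check.
    for (l, r) in out.iter_mut().zip(b) {
        *l += *r;
    }
    out
}
```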
~~The downside of this is that non-reference RHS arguments will kill this optimization, because it will use `.into_iter()` and `GenericArrayIter`. There really isn't a good way around this currently.~~ Update: I found a good way around this.
The upside of all of this is that passing an arbitrary `GenericSequence` without knowing the length is finally feasible, as shown in `tests/std.rs`, and here:
pub fn test_generic<S>(s: S)
where
    S: FunctionalSequence<i32>,            // `.map`
    SequenceItem<S>: Add<i32, Output=i32>, // `+`
    S: MappedGenericSequence<i32, i32>,    // `i32` -> `i32`
    MappedSequence<S, i32, i32>: Debug     // println!
{
    let a = s.map(|x| x + 1);
    println!("{:?}", a);
}
This still has zero runtime length checking, but we've avoided having to know the length of the sequence. Furthermore, `test_generic` can now work for `GenericArray`, `&GenericArray` and `&mut GenericArray` with no problems.
BREAKING CHANGES:
- The implementation of `FromIterator` for `GenericArray` now panics when the given iterator doesn't produce enough elements, whereas before it padded the result with defaults.
- `map_ref` and `zip_ref` are gone, replaced with the new functional system.
- ~~`map`/`zip` can fail to optimize unless used with references.~~ Fixed
- I should note that auto-vectorization only works on numbers anyway, so it's no worse than `vec::IntoIter` in the worst case.
- ~~`GenericArray` and `GenericArrayIter` now implement `Send`/`Sync` when possible.~~ This was a mistake, fixed in #61
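The `FromIterator` change in the first item can be illustrated with a small std-only sketch. Both functions are hypothetical re-creations over plain arrays, written only to contrast the two behaviors; they are not the crate's implementation, and how the crate treats iterators with too many elements is not covered here.

```rust
/// Old behavior, re-created for illustration: missing elements are
/// filled with `T::default()`.
pub fn collect_padded<T, I, const N: usize>(iter: I) -> [T; N]
where
    T: Default,
    I: IntoIterator<Item = T>,
{
    let mut it = iter.into_iter();
    std::array::from_fn(|_| it.next().unwrap_or_default())
}

/// New behavior, re-created for illustration: panic when the iterator
/// yields fewer than `N` elements. Extra elements are simply left
/// unconsumed in this sketch.
pub fn collect_strict<T, I, const N: usize>(iter: I) -> [T; N]
where
    I: IntoIterator<Item = T>,
{
    let mut it = iter.into_iter();
    std::array::from_fn(|_| it.next().expect("iterator produced too few elements"))
}
```

Code that relied on the padding needs to supply the missing elements itself, for example by chaining `std::iter::repeat_with(Default::default)` onto the iterator before collecting.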
What do you think? Perhaps I should write up some examples for the docs, too?
If I failed to explain anything, made a mistake or could improve on anything, please let me know. I just want to make the best things I can.