A more convenient `#[target_feature]` replacement

Overview

A more convenient #[target_feature] replacement

To get good performance out of SIMD everything on the SIMD codepath must be inlined. With how SIMD is currently implemented in Rust one of two things have to be true for a function using SIMD to be inlinable: (and this includes the SIMD intrinsics themselves)

a) The whole program has to be compiled with the relevant -C target-cpu or -C target-feature flags.

b) SIMD support must be automatically detected at runtime, and every function on the SIMD codepath must be marked with #[target_feature].

Both have their downsides. Setting the target-cpu or target-features makes the resulting binary incompatible with older CPUs, while using #[target_feature] is incredibly inconvenient.

This crate is meant to make #[target_feature] less painful to use.

Problems with #[target_feature]

When we're not compiling with the relevant target-cpu/target-feature flags everything on the SIMD codepath must be marked with the #[target_feature] attribute. This is not a problem when all of your SIMD code is neatly encapsulated inside of a single function, but once you start to build out more elaborate abstractions it starts to become painful to use.

  • It can only be used on unsafe functions, so everything on your SIMD codepath now has to be unsafe.

    In theory this is nice - these functions require the relevant SIMD instructions to be present at runtime, so calling them without checking is obviously unsafe! But in practice this is rarely what you want. When you build an abstraction over SIMD code you usually want to assume that internally within your module all of the necessary SIMD instructions are available, and you only want to check this at the boundaries when you're first entering your module. You do not want to infect everything inside of the module with unsafe since you've already checked this invariant at the module's API boundary.

  • It cannot be used on non-unsafe trait methods.

    If you're implementing a trait, say for example std::ops::Add, then you cannot mark the method unsafe unless the original trait also has it marked as unsafe, and usually it doesn't.

  • It makes it impossible to abstract over a given SIMD instruction set using a trait.

    For example, let's assume you want to abstract over which SIMD instructions you use using a trait in the following way:

    trait Backend {
        unsafe fn sum(input: &[u32]) -> u32;
    }
    
    struct AVX;
    impl Backend for AVX {
        #[target_feature(enable = "avx")]
        unsafe fn sum(xs: &[u32]) -> u32 {
            // ...
            todo!();
        }
    }
    
    struct AVX2;
    impl Backend for AVX2 {
        #[target_feature(enable = "avx2")]
        unsafe fn sum(xs: &[u32]) -> u32 {
            // ...
            todo!();
        }
    }
    
    // And now you want a have function which calls into that trait:
    unsafe fn do_calculations<B>(xs: &[u32]) -> u32 where B: Backend {
        let value = B::sum(xs);
        // ...do some more calculations here...
        value
    }

    We have a problem here. This has to be marked with #[target_feature], and that has to specify the concrete feature flag for a given SIMD instruction set, but this function is generic so we can't do that!

How does this crate make it better?

You can now mark safe functions with #[target_feature]

This crate exposes an #[unsafe_target_feature] macro which works just like #[target_feature] except it moves the unsafe from the function prototype into the macro name, and can be used on safe functions.

// ERROR: `#[target_feature(..)]` can only be applied to `unsafe` functions
#[target_feature(enable = "avx2")]
fn func() {}
// It works, but must be `unsafe`
#[target_feature(enable = "avx2")]
unsafe fn func() {}
use unsafe_target_feature::unsafe_target_feature;

// No `unsafe` on the function itself!
#[unsafe_target_feature("avx2")]
fn func() {}

It can also be used to mark functions inside of impls:

struct S;

impl core::ops::Add for S {
    type Output = S;
    // ERROR: method `add` has an incompatible type for trait
    #[target_feature(enable = "avx2")]
    unsafe fn add(self, rhs: S) -> S {
        S
    }
}
use unsafe_target_feature::unsafe_target_feature;

struct S;

#[unsafe_target_feature("avx2")]
impl core::ops::Add for S {
    type Output = S;
    // No `unsafe` on the function itself!
    fn add(self, rhs: S) -> S {
        S
    }
}

You can generate specialized copies of a module for each target feature

use unsafe_target_feature::unsafe_target_feature_specialize;

#[unsafe_target_feature_specialize("sse2", "avx2", conditional("avx512ifma", nightly))]
mod simd {
    #[for_target_feature("sse2")]
    pub const CONSTANT: u32 = 1;

    #[for_target_feature("avx2")]
    pub const CONSTANT: u32 = 2;

    #[for_target_feature("avx512ifma")]
    pub const CONSTANT: u32 = 3;

    pub fn func() { /* ... */ }
}

fn entry_point() {
    #[cfg(nightly)]
    if std::is_x86_feature_detected!("avx512ifma") {
        return simd_avx512ifma::func();
    }

    if std::is_x86_feature_detected!("avx2") {
        return simd_avx2::func();
    }

    if std::is_x86_feature_detected!("sse2") {
        return simd_sse2::func();
    }

    unimplemented!();
}

How to use #[unsafe_target_feature]?

  • Can be used on fns, impls and mods.
  • When used on a function will only apply to that function; it won't apply to any nested functions, traits, mods, etc.
  • When used on an impl will only apply to all of the functions directly defined inside of that impl.
  • When used on a mod will only apply to all of the fns and impls directly defined inside of that mod.
  • Cannot be used on methods which use self or Self; instead use it on the impl in which the method is defined.

License

Licensed under either of

at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

You might also like...
procs is a replacement for ps written in Rust.
procs is a replacement for ps written in Rust.

procs is a replacement for ps written in Rust. Documentation quick links Features Platform Installation Usage Configuration Features Output by t

fastmod is a fast partial replacement for the codemod tool

fastmod is a fast partial replacement for codemod. Like codemod, it is a tool to assist you with large-scale codebase refactors, and it supports most of codemod's options.

A drop-in replacement for `dapp` and `seth` in Rust

dapptools.rs Rust port of DappTools dapp example Usage Run Solidity tests Any contract that contains a function starting with test is being tested. Th

A readline replacement written in Rust

A readline replacement written in Rust Basic example // Create a default reedline object to handle user input use reedline::{DefaultPrompt, Reedline,

Fls - Ferris-LS, a very bad LS replacement. Besides that, proves that I suck at clean & good code.

FLS A handy ls remake, purely for learning. Why FLS? There's no reason, at all, don't use it. I just want to learn Rust :D Usage Flags: -i = Use icons

๐Ÿฆ€๏ธatos for linux by rust - A partial replacement for Apple's atos tool for converting addresses within a binary file to symbols.

atosl-rs ๐Ÿฆ€๏ธ atos for linux by rust - A partial replacement for Apple's atos tool for converting addresses within a binary file to symbols. tested on

xcp is a (partial) clone of the Unix cp command. It is not intended as a full replacement

xcp is a (partial) clone of the Unix cp command. It is not intended as a full replacement, but as a companion utility with some more user-friendly feedback and some optimisations that make sense under certain tasks (see below).

drop-in replacement for libfuzzer
drop-in replacement for libfuzzer

Fazi A reimplementation of libfuzzer in Rust with some improvements Supported Features libFuzzer's mutations SanCov feedback Building without a main()

Xcode Neovim Replacement-ish.
Xcode Neovim Replacement-ish.

An XCode replacement-ish development environment that aims to be your reliable XCode alternative to develop exciting new [apple] software products ๐Ÿš€

Owner
Koute
KSM: HTk3eccL7WBkiyxz1gBcqQRghsJigoDMD7mnQaz1UAbMpQV
Koute
A drop-in replacement for std::time::Instant that is faster and more accurate.

Fastant A drop-in replacement for std::time::Instant that measures time with high performance and high accuracy powered by Time Stamp Counter (TSC). U

FastLabs 8 Sep 15, 2024
Irx-config - The library provides convenient way to represent/parse configuration from different sources

The irx-config library provides convenient way to represent/parse configuration from different sources. The main goals is to be very easy to use and t

Andriy Bakay 2 Sep 14, 2022
Simple but convenient CLI-based Matrix client app for sending and receiving (in Rust)

matrix-commander-rs simple but convenient CLI-based Matrix client app for sending and receiving Help create this Rust program! This project is current

null 19 Dec 30, 2022
simple but convenient CLI-based Nostr client app for following users and sending DMs

nostr-commander-rs TLDR: simple but convenient CLI-based Nostr client app for publishing, sending DMs, as well as following users and channels nostr-c

null 18 Dec 30, 2022
`matchable` provides a convenient enum for checking if a piece of text is matching a string or a regex.

matchable matchable provides a convenient enum for checking if a piece of text is matching a string or a regex. The common usage of this crate is used

Pig Fang 6 Dec 19, 2022
A convenient macro for building PathBufs in Rust.

pathbuf pathbuf is a simple crate which provides the pathbuf macro to conveniently construct the Rust PathBuf type. Example use pathbuf::pathbuf; use

Andrew Lilley Brinker 6 Jan 16, 2023
A simple yet convenient cross-platform ARP spoofer

ruuf - A simple yet convenient cross-platform ARP spoofer Poison the ARP cache of the given victim, thereby redirecting the traffic to the target ther

null 6 Feb 19, 2023
This library provides a convenient derive macro for the standard library's std::error::Error trait.

derive(Error) This library provides a convenient derive macro for the standard library's std::error::Error trait. [dependencies] therror = "1.0" Compi

Sebastian Thiel 5 Oct 23, 2023
exa is a modern replacement for ls.

exa exa is a modern replacement for ls. README Sections: Options โ€” Installation โ€” Development exa is a modern replacement for the venerable file-listi

Benjamin Sago 20.3k Jan 8, 2023
zoxide is a blazing fast replacement for your cd command

zoxide A smarter cd command for your terminal zoxide is a blazing fast replacement for your cd command, inspired by z and z.lua. It keeps track of the

Ajeet D'Souza 8.7k Dec 31, 2022