Wasmtime - Standalone JIT-style runtime for WebAssembly, using Cranelift

Overview

wasmtime

A standalone runtime for WebAssembly

A Bytecode Alliance project

Guide | Contributing | Website | Chat

Installation

The Wasmtime CLI can be installed on Linux and macOS with a small install script:

$ curl https://wasmtime.dev/install.sh -sSf | bash

Windows or otherwise interested users can download installers and binaries directly from the GitHub Releases page.

Example

If you've got the Rust compiler installed then you can take some Rust source code:

fn main() {
    println!("Hello, world!");
}

and compile/run it with:

$ rustup target add wasm32-wasi
$ rustc hello.rs --target wasm32-wasi
$ wasmtime hello.wasm
Hello, world!

Features

  • Lightweight. Wasmtime is a standalone runtime for WebAssembly that scales with your needs. It fits on tiny chips and can also make use of huge servers. Wasmtime can be embedded into almost any application too.

  • Fast. Wasmtime is built on the optimizing Cranelift code generator to quickly generate high-quality machine code at runtime.

  • Configurable. Whether you need to precompile your wasm ahead of time, or interpret it at runtime, Wasmtime has you covered for all your wasm-executing needs.

  • WASI. Wasmtime supports a rich set of APIs for interacting with the host environment through the WASI standard.

  • Standards Compliant. Wasmtime passes the official WebAssembly test suite, implements the official C API of wasm, and implements future proposals to WebAssembly as well. Wasmtime developers are intimately engaged with the WebAssembly standards process all along the way too.

Language Support

You can use Wasmtime from a variety of different languages through embeddings of the implementation:

Documentation

📚 Read the Wasmtime guide here! 📚

The Wasmtime guide is the best starting point to learn what Wasmtime can do for you and to find answers to your questions about Wasmtime. If you're curious about contributing to Wasmtime, it can also help you do that!


It's Wasmtime.

Comments
  • add riscv64 backend for cranelift.

    I have been working on a riscv64 backend for Cranelift recently. So far it passes all of the run tests in filetests.

    Some features are not implemented yet: i128 mul, div, and rem; all SIMD types; and compare overflow.

    Some tests need platform support. For example, bitrev needs qemu-riscv64 to support the bitmanip and Zbkb extensions (I don't know how to enable them).

    cranelift cranelift:meta cranelift:area:aarch64 
    opened by yuyang-ok 80
  • `wasmtime`: Implement fast Wasm stack walking

    Why do we want Wasm stack walking to be fast? Because we capture stacks whenever there is a trap and traps actually happen fairly frequently with short-lived programs and WASI's exit.

    Previously, we would rely on generating the system unwind info (e.g. .eh_frame) and using the system unwinder (via the backtrace crate) to walk the full stack and filter out any non-Wasm stack frames. This can, unfortunately, be slow for two primary reasons:

    1. The system unwinder is doing O(all-kinds-of-frames) work rather than O(wasm-frames) work.

    2. System unwind info and the system unwinder need to be much more general than a purpose-built stack walker for Wasm needs to be. It has to handle any kind of stack frame that any compiler might emit, whereas our Wasm frames are emitted by Cranelift and always have frame pointers. This translates into implementation complexity and general overhead. There can also be unnecessary-for-our-use-cases global synchronization and locks involved, further slowing down stack walking in the presence of multiple threads trying to capture stacks in parallel.

    This commit introduces a purpose-built stack walker for traversing just our Wasm frames. To find all the sequences of Wasm-to-Wasm stack frames, and ignore non-Wasm stack frames, we keep a linked list of (entry stack pointer, exit frame pointer) pairs. This linked list is maintained via Wasm-to-host and host-to-Wasm trampolines. Within a sequence of Wasm-to-Wasm calls, we can use frame pointers (which Cranelift preserves) to find the next older Wasm frame on the stack, and we keep doing this until we reach the entry stack pointer, meaning that the next older frame will be a host frame.
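The walk described above can be sketched as a small simulation. This is a toy model under stated assumptions, not Wasmtime's implementation: the `Activation` type, the word-indexed `stack` slice, and the frame layout (saved caller frame pointer at `fp`, return address at `fp + 1`) are all hypothetical stand-ins chosen for illustration.

```rust
/// One contiguous run of Wasm-to-Wasm frames, bracketed by trampolines.
/// (Hypothetical type for this sketch, not Wasmtime's actual layout.)
struct Activation {
    /// Stack pointer saved by the host-to-Wasm trampoline on entry.
    entry_sp: usize,
    /// Frame pointer of the youngest Wasm frame, saved by the
    /// Wasm-to-host trampoline on exit.
    exit_fp: usize,
    /// Next older activation, if any (the linked list).
    older: Option<Box<Activation>>,
}

/// Collects one code address per Wasm frame, youngest first.
///
/// `stack` models memory as words indexed by address; the stack grows
/// toward lower addresses. At a frame pointer `fp`, `stack[fp]` holds the
/// caller's frame pointer and `stack[fp + 1]` holds the return address.
fn walk_wasm_frames(stack: &[usize], youngest: Option<&Activation>) -> Vec<usize> {
    let mut addrs = Vec::new();
    let mut current = youngest;
    while let Some(activation) = current {
        let mut fp = activation.exit_fp;
        // Follow saved frame pointers until we reach the entry stack
        // pointer: the next older frame belongs to the host.
        while fp < activation.entry_sp {
            addrs.push(stack[fp + 1]);
            fp = stack[fp];
        }
        // Skip the host frames entirely by jumping to the older activation.
        current = activation.older.as_deref();
    }
    addrs
}

/// Builds a toy stack with two Wasm activations separated by host frames.
fn demo_walk() -> Vec<usize> {
    let mut stack = vec![0xdead_usize; 32]; // host frames are opaque junk
    // Older activation: two Wasm frames at fp = 26 and fp = 22.
    stack[26] = 28;
    stack[27] = 0x100;
    stack[22] = 26;
    stack[23] = 0x200;
    // Younger activation: three Wasm frames at fp = 12, 8, and 4.
    stack[12] = 14;
    stack[13] = 0x300;
    stack[8] = 12;
    stack[9] = 0x400;
    stack[4] = 8;
    stack[5] = 0x500;

    let older = Activation { entry_sp: 28, exit_fp: 22, older: None };
    let younger = Activation { entry_sp: 14, exit_fp: 4, older: Some(Box::new(older)) };
    walk_wasm_frames(&stack, Some(&younger))
}
```

Calling `demo_walk()` visits the five Wasm frames and never inspects the junk words that stand in for host frames, which is the O(wasm-frames) behavior the commit is after.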

    The trampolines need to avoid a couple of stumbling blocks. First, they need to be compiled ahead of time, since we may not have access to a compiler at runtime (e.g. if the cranelift feature is disabled) but still want to be able to call functions that have already been compiled and get stack traces for those functions. Usually this means we would compile the appropriate trampolines inside Module::new and the compiled module object would hold the trampolines. However, we also need to support calling host functions that are wrapped into wasmtime::Funcs, and there doesn't exist any ahead-of-time compiled module object to hold the appropriate trampolines:

    // Define a host function.
    let func_type = wasmtime::FuncType::new(
        vec![wasmtime::ValType::I32],
        vec![wasmtime::ValType::I32],
    );
    let func = wasmtime::Func::new(&mut store, func_type, |_, params, results| {
        // ...
        Ok(())
    });

    // Call that host function; the store is passed explicitly.
    let mut results = vec![wasmtime::Val::I32(0)];
    func.call(&mut store, &[wasmtime::Val::I32(0)], &mut results)?;
    

    Therefore, we define one host-to-Wasm trampoline and one Wasm-to-host trampoline in assembly that work for all Wasm and host function signatures. These trampolines are careful to only use volatile registers, avoid touching any register that is an argument in the calling convention ABI, and tail call to the target callee function. This allows forwarding any set of arguments and any returns to and from the callee, while also allowing us to maintain our linked list of Wasm stack and frame pointers before transferring control to the callee. These trampolines are not used in Wasm-to-Wasm calls, only when crossing the host-Wasm boundary, so they do not impose overhead on regular calls. (And if using one trampoline for all host-Wasm boundary crossing ever breaks branch prediction enough in the CPU to become any kind of bottleneck, we can do fun things like have multiple copies of the same trampoline and choose a random copy for each function, sharding the functions across branch predictor entries.)

    Finally, this commit also ends the use of a synthetic Module and allocating a stubbed out VMContext for host functions. Instead, we define a VMHostFuncContext with its own magic value, similar to VMComponentContext, specifically for host functions.
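To illustrate the idea of telling context kinds apart by a leading magic value, here is a minimal sketch. Everything in it is an assumption for illustration: the `ToyVmContext` and `ToyHostFuncContext` types, their fields, and the two magic constants are hypothetical and are not taken from Wasmtime's source.

```rust
/// Hypothetical magic values for this sketch only.
const TOY_VMCONTEXT_MAGIC: u32 = u32::from_le_bytes(*b"core");
const TOY_HOST_FUNC_MAGIC: u32 = u32::from_le_bytes(*b"host");

/// Stand-in for a Wasm instance context; the magic is the first field.
#[allow(dead_code)]
#[repr(C)]
struct ToyVmContext {
    magic: u32,
    num_globals: u32,
}

/// Stand-in for a host function context with its own magic value.
#[allow(dead_code)]
#[repr(C)]
struct ToyHostFuncContext {
    magic: u32,
    host_func_id: u32,
}

#[derive(Debug, PartialEq)]
enum ContextKind {
    Wasm,
    HostFunc,
    Unknown,
}

/// Reads the leading magic word to decide what an opaque pointer refers to.
///
/// Safety: `ptr` must point to readable, 4-byte-aligned memory that begins
/// with a `u32` magic value.
unsafe fn classify(ptr: *const u8) -> ContextKind {
    let magic = unsafe { ptr.cast::<u32>().read() };
    match magic {
        TOY_VMCONTEXT_MAGIC => ContextKind::Wasm,
        TOY_HOST_FUNC_MAGIC => ContextKind::HostFunc,
        _ => ContextKind::Unknown,
    }
}

/// Classifies one context of each kind through type-erased pointers.
fn demo_classify() -> (ContextKind, ContextKind) {
    let wasm = ToyVmContext { magic: TOY_VMCONTEXT_MAGIC, num_globals: 0 };
    let host = ToyHostFuncContext { magic: TOY_HOST_FUNC_MAGIC, host_func_id: 7 };
    unsafe {
        (
            classify(&wasm as *const ToyVmContext as *const u8),
            classify(&host as *const ToyHostFuncContext as *const u8),
        )
    }
}
```

Because both structs are `#[repr(C)]` with the magic as their first field, code holding only an opaque pointer can check which kind of context it has before casting, without needing a synthetic module or a stubbed-out instance context.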

    Benchmarks

    Traps and Stack Traces

    Taking stack traces on traps is much faster: the benchmarks below now take 64% to 99.95% less time than they used to.
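The Criterion output below reports both a time change and a throughput change for each benchmark. The two are related by simple arithmetic, sketched here; the function name `throughput_change` is just a label for this example.

```rust
/// Converts a relative time change (e.g. -0.85153 for "-85.153%") into the
/// corresponding relative throughput change, assuming the amount of work
/// per iteration is unchanged.
fn throughput_change(time_change: f64) -> f64 {
    1.0 / (1.0 + time_change) - 1.0
}
```

For example, the median time change of -85.153% for multi-threaded-traps/0 corresponds to a throughput change of about +573.5%, and the -64.933% for many-stack-frames-traps/512 to about +185.2%. Both agree with the reported `thrpt` lines up to rounding. Put another way, the smallest improvement is roughly a 2.9x speedup, and the 99.955% reduction for many-modules-registered-traps/4096 is a speedup of over 2,000x.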

    multi-threaded-traps/0  time:   [2.5686 us 2.5808 us 2.5934 us]
                            thrpt:  [0.0000  elem/s 0.0000  elem/s 0.0000  elem/s]
                     change:
                            time:   [-85.419% -85.153% -84.869%] (p = 0.00 < 0.05)
                            thrpt:  [+560.90% +573.56% +585.84%]
                            Performance has improved.
    Found 8 outliers among 100 measurements (8.00%)
      4 (4.00%) high mild
      4 (4.00%) high severe
    multi-threaded-traps/1  time:   [2.9021 us 2.9167 us 2.9322 us]
                            thrpt:  [341.04 Kelem/s 342.86 Kelem/s 344.58 Kelem/s]
                     change:
                            time:   [-91.455% -91.294% -91.096%] (p = 0.00 < 0.05)
                            thrpt:  [+1023.1% +1048.6% +1070.3%]
                            Performance has improved.
    Found 6 outliers among 100 measurements (6.00%)
      1 (1.00%) high mild
      5 (5.00%) high severe
    multi-threaded-traps/2  time:   [2.9996 us 3.0145 us 3.0295 us]
                            thrpt:  [660.18 Kelem/s 663.47 Kelem/s 666.76 Kelem/s]
                     change:
                            time:   [-94.040% -93.910% -93.762%] (p = 0.00 < 0.05)
                            thrpt:  [+1503.1% +1542.0% +1578.0%]
                            Performance has improved.
    Found 5 outliers among 100 measurements (5.00%)
      5 (5.00%) high severe
    multi-threaded-traps/4  time:   [5.5768 us 5.6052 us 5.6364 us]
                            thrpt:  [709.68 Kelem/s 713.63 Kelem/s 717.25 Kelem/s]
                     change:
                            time:   [-93.193% -93.121% -93.052%] (p = 0.00 < 0.05)
                            thrpt:  [+1339.2% +1353.6% +1369.1%]
                            Performance has improved.
    multi-threaded-traps/8  time:   [8.6408 us 9.1212 us 9.5438 us]
                            thrpt:  [838.24 Kelem/s 877.08 Kelem/s 925.84 Kelem/s]
                     change:
                            time:   [-94.754% -94.473% -94.202%] (p = 0.00 < 0.05)
                            thrpt:  [+1624.7% +1709.2% +1806.1%]
                            Performance has improved.
    multi-threaded-traps/16 time:   [10.152 us 10.840 us 11.545 us]
                            thrpt:  [1.3858 Melem/s 1.4760 Melem/s 1.5761 Melem/s]
                     change:
                            time:   [-97.042% -96.823% -96.577%] (p = 0.00 < 0.05)
                            thrpt:  [+2821.5% +3048.1% +3281.1%]
                            Performance has improved.
    Found 1 outliers among 100 measurements (1.00%)
      1 (1.00%) high mild
    
    many-modules-registered-traps/1
                            time:   [2.6278 us 2.6361 us 2.6447 us]
                            thrpt:  [378.11 Kelem/s 379.35 Kelem/s 380.55 Kelem/s]
                     change:
                            time:   [-85.311% -85.108% -84.909%] (p = 0.00 < 0.05)
                            thrpt:  [+562.65% +571.51% +580.76%]
                            Performance has improved.
    Found 9 outliers among 100 measurements (9.00%)
      3 (3.00%) high mild
      6 (6.00%) high severe
    many-modules-registered-traps/8
                            time:   [2.6294 us 2.6460 us 2.6623 us]
                            thrpt:  [3.0049 Melem/s 3.0235 Melem/s 3.0425 Melem/s]
                     change:
                            time:   [-85.895% -85.485% -85.022%] (p = 0.00 < 0.05)
                            thrpt:  [+567.63% +588.95% +608.95%]
                            Performance has improved.
    Found 8 outliers among 100 measurements (8.00%)
      3 (3.00%) high mild
      5 (5.00%) high severe
    many-modules-registered-traps/64
                            time:   [2.6218 us 2.6329 us 2.6452 us]
                            thrpt:  [24.195 Melem/s 24.308 Melem/s 24.411 Melem/s]
                     change:
                            time:   [-93.629% -93.551% -93.470%] (p = 0.00 < 0.05)
                            thrpt:  [+1431.4% +1450.6% +1469.5%]
                            Performance has improved.
    Found 3 outliers among 100 measurements (3.00%)
      3 (3.00%) high mild
    many-modules-registered-traps/512
                            time:   [2.6569 us 2.6737 us 2.6923 us]
                            thrpt:  [190.17 Melem/s 191.50 Melem/s 192.71 Melem/s]
                     change:
                            time:   [-99.277% -99.268% -99.260%] (p = 0.00 < 0.05)
                            thrpt:  [+13417% +13566% +13731%]
                            Performance has improved.
    Found 4 outliers among 100 measurements (4.00%)
      4 (4.00%) high mild
    many-modules-registered-traps/4096
                            time:   [2.7258 us 2.7390 us 2.7535 us]
                            thrpt:  [1.4876 Gelem/s 1.4955 Gelem/s 1.5027 Gelem/s]
                     change:
                            time:   [-99.956% -99.955% -99.955%] (p = 0.00 < 0.05)
                            thrpt:  [+221417% +223380% +224881%]
                            Performance has improved.
    Found 2 outliers among 100 measurements (2.00%)
      1 (1.00%) high mild
      1 (1.00%) high severe
    
    many-stack-frames-traps/1
                            time:   [1.4658 us 1.4719 us 1.4784 us]
                            thrpt:  [676.39 Kelem/s 679.38 Kelem/s 682.21 Kelem/s]
                     change:
                            time:   [-90.368% -89.947% -89.586%] (p = 0.00 < 0.05)
                            thrpt:  [+860.23% +894.72% +938.21%]
                            Performance has improved.
    Found 8 outliers among 100 measurements (8.00%)
      5 (5.00%) high mild
      3 (3.00%) high severe
    many-stack-frames-traps/8
                            time:   [2.4772 us 2.4870 us 2.4973 us]
                            thrpt:  [3.2034 Melem/s 3.2167 Melem/s 3.2294 Melem/s]
                     change:
                            time:   [-85.550% -85.370% -85.199%] (p = 0.00 < 0.05)
                            thrpt:  [+575.65% +583.51% +592.03%]
                            Performance has improved.
    Found 8 outliers among 100 measurements (8.00%)
      4 (4.00%) high mild
      4 (4.00%) high severe
    many-stack-frames-traps/64
                            time:   [10.109 us 10.171 us 10.236 us]
                            thrpt:  [6.2525 Melem/s 6.2925 Melem/s 6.3309 Melem/s]
                     change:
                            time:   [-78.144% -77.797% -77.336%] (p = 0.00 < 0.05)
                            thrpt:  [+341.22% +350.38% +357.55%]
                            Performance has improved.
    Found 7 outliers among 100 measurements (7.00%)
      5 (5.00%) high mild
      2 (2.00%) high severe
    many-stack-frames-traps/512
                            time:   [126.16 us 126.54 us 126.96 us]
                            thrpt:  [4.0329 Melem/s 4.0461 Melem/s 4.0583 Melem/s]
                     change:
                            time:   [-65.364% -64.933% -64.453%] (p = 0.00 < 0.05)
                            thrpt:  [+181.32% +185.17% +188.71%]
                            Performance has improved.
    Found 4 outliers among 100 measurements (4.00%)
      4 (4.00%) high severe
    

    Calls

    There is, however, a small regression in raw Wasm-to-host and host-to-Wasm call performance due to the new trampolines. It seems to be on the order of about 2-10 nanoseconds per call, depending on the benchmark.

    I believe this regression is ultimately acceptable because

    1. this overhead will be vastly dominated by whatever work a non-nop callee actually does,

    2. we will need these trampolines, or something like them, when implementing the Wasm exceptions proposal to do things like translate Wasm's exceptions into Rust's Results,

    3. and because the performance improvements to trapping and capturing stack traces are of a much larger magnitude than these call regressions.
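The 2-10 nanosecond estimate can be cross-checked against the Criterion output that follows, which reports the new time and the relative change but not the absolute difference. A minimal sketch of that arithmetic (the function name is hypothetical):

```rust
/// Recovers the absolute slowdown in nanoseconds from a benchmark's new
/// median time and its relative change (e.g. 0.17183 for "+17.183%").
fn absolute_overhead_ns(new_time_ns: f64, relative_change: f64) -> f64 {
    let old_time_ns = new_time_ns / (1.0 + relative_change);
    new_time_ns - old_time_ns
}
```

Using the median values for "sync/no-hook/host-to-wasm - typed - nop" (28.757 ns, +17.183%) gives roughly 4.2 ns of added cost per call, and "sync/no-hook/host-to-wasm - typed - nop-params-and-results" (34.388 ns, +42.706%) gives roughly 10.3 ns, in line with the stated range.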

    sync/no-hook/host-to-wasm - typed - nop
                            time:   [28.683 ns 28.757 ns 28.844 ns]
                            change: [+16.472% +17.183% +17.904%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 10 outliers among 100 measurements (10.00%)
      1 (1.00%) low mild
      4 (4.00%) high mild
      5 (5.00%) high severe
    sync/no-hook/host-to-wasm - untyped - nop
                            time:   [42.515 ns 42.652 ns 42.841 ns]
                            change: [+12.371% +14.614% +17.462%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 11 outliers among 100 measurements (11.00%)
      1 (1.00%) high mild
      10 (10.00%) high severe
    sync/no-hook/host-to-wasm - unchecked - nop
                            time:   [33.936 ns 34.052 ns 34.179 ns]
                            change: [+25.478% +26.938% +28.369%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 9 outliers among 100 measurements (9.00%)
      7 (7.00%) high mild
      2 (2.00%) high severe
    sync/no-hook/host-to-wasm - typed - nop-params-and-results
                            time:   [34.290 ns 34.388 ns 34.502 ns]
                            change: [+40.802% +42.706% +44.526%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 13 outliers among 100 measurements (13.00%)
      5 (5.00%) high mild
      8 (8.00%) high severe
    sync/no-hook/host-to-wasm - untyped - nop-params-and-results
                            time:   [62.546 ns 62.721 ns 62.919 ns]
                            change: [+2.5014% +3.6319% +4.8078%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 12 outliers among 100 measurements (12.00%)
      2 (2.00%) high mild
      10 (10.00%) high severe
    sync/no-hook/host-to-wasm - unchecked - nop-params-and-results
                            time:   [42.609 ns 42.710 ns 42.831 ns]
                            change: [+20.966% +22.282% +23.475%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 11 outliers among 100 measurements (11.00%)
      4 (4.00%) high mild
      7 (7.00%) high severe
    
    sync/hook-sync/host-to-wasm - typed - nop
                            time:   [29.546 ns 29.675 ns 29.818 ns]
                            change: [+20.693% +21.794% +22.836%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 5 outliers among 100 measurements (5.00%)
      3 (3.00%) high mild
      2 (2.00%) high severe
    sync/hook-sync/host-to-wasm - untyped - nop
                            time:   [45.448 ns 45.699 ns 45.961 ns]
                            change: [+17.204% +18.514% +19.590%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 14 outliers among 100 measurements (14.00%)
      4 (4.00%) high mild
      10 (10.00%) high severe
    sync/hook-sync/host-to-wasm - unchecked - nop
                            time:   [34.334 ns 34.437 ns 34.558 ns]
                            change: [+23.225% +24.477% +25.886%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 12 outliers among 100 measurements (12.00%)
      5 (5.00%) high mild
      7 (7.00%) high severe
    sync/hook-sync/host-to-wasm - typed - nop-params-and-results
                            time:   [36.594 ns 36.763 ns 36.974 ns]
                            change: [+41.967% +47.261% +52.086%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 12 outliers among 100 measurements (12.00%)
      3 (3.00%) high mild
      9 (9.00%) high severe
    sync/hook-sync/host-to-wasm - untyped - nop-params-and-results
                            time:   [63.541 ns 63.831 ns 64.194 ns]
                            change: [-4.4337% -0.6855% +2.7134%] (p = 0.73 > 0.05)
                            No change in performance detected.
    Found 8 outliers among 100 measurements (8.00%)
      6 (6.00%) high mild
      2 (2.00%) high severe
    sync/hook-sync/host-to-wasm - unchecked - nop-params-and-results
                            time:   [43.968 ns 44.169 ns 44.437 ns]
                            change: [+18.772% +21.802% +24.623%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 15 outliers among 100 measurements (15.00%)
      3 (3.00%) high mild
      12 (12.00%) high severe
    
    async/no-hook/host-to-wasm - typed - nop
                            time:   [4.9612 us 4.9743 us 4.9889 us]
                            change: [+9.9493% +11.911% +13.502%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 10 outliers among 100 measurements (10.00%)
      6 (6.00%) high mild
      4 (4.00%) high severe
    async/no-hook/host-to-wasm - untyped - nop
                            time:   [5.0030 us 5.0211 us 5.0439 us]
                            change: [+10.841% +11.873% +12.977%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 10 outliers among 100 measurements (10.00%)
      3 (3.00%) high mild
      7 (7.00%) high severe
    async/no-hook/host-to-wasm - typed - nop-params-and-results
                            time:   [4.9273 us 4.9468 us 4.9700 us]
                            change: [+4.7381% +6.8445% +8.8238%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 14 outliers among 100 measurements (14.00%)
      5 (5.00%) high mild
      9 (9.00%) high severe
    async/no-hook/host-to-wasm - untyped - nop-params-and-results
                            time:   [5.1151 us 5.1338 us 5.1555 us]
                            change: [+9.5335% +11.290% +13.044%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 16 outliers among 100 measurements (16.00%)
      3 (3.00%) high mild
      13 (13.00%) high severe
    
    async/hook-sync/host-to-wasm - typed - nop
                            time:   [4.9330 us 4.9394 us 4.9467 us]
                            change: [+10.046% +11.038% +12.035%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 12 outliers among 100 measurements (12.00%)
      5 (5.00%) high mild
      7 (7.00%) high severe
    async/hook-sync/host-to-wasm - untyped - nop
                            time:   [5.0073 us 5.0183 us 5.0310 us]
                            change: [+9.3828% +10.565% +11.752%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 8 outliers among 100 measurements (8.00%)
      3 (3.00%) high mild
      5 (5.00%) high severe
    async/hook-sync/host-to-wasm - typed - nop-params-and-results
                            time:   [4.9610 us 4.9839 us 5.0097 us]
                            change: [+9.0857% +11.513% +14.359%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 13 outliers among 100 measurements (13.00%)
      7 (7.00%) high mild
      6 (6.00%) high severe
    async/hook-sync/host-to-wasm - untyped - nop-params-and-results
                            time:   [5.0995 us 5.1272 us 5.1617 us]
                            change: [+9.3600% +11.506% +13.809%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 10 outliers among 100 measurements (10.00%)
      6 (6.00%) high mild
      4 (4.00%) high severe
    
    async-pool/no-hook/host-to-wasm - typed - nop
                            time:   [2.4242 us 2.4316 us 2.4396 us]
                            change: [+7.8756% +8.8803% +9.8346%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 8 outliers among 100 measurements (8.00%)
      5 (5.00%) high mild
      3 (3.00%) high severe
    async-pool/no-hook/host-to-wasm - untyped - nop
                            time:   [2.5102 us 2.5155 us 2.5210 us]
                            change: [+12.130% +13.194% +14.270%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 12 outliers among 100 measurements (12.00%)
      4 (4.00%) high mild
      8 (8.00%) high severe
    async-pool/no-hook/host-to-wasm - typed - nop-params-and-results
                            time:   [2.4203 us 2.4310 us 2.4440 us]
                            change: [+4.0380% +6.3623% +8.7534%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 14 outliers among 100 measurements (14.00%)
      5 (5.00%) high mild
      9 (9.00%) high severe
    async-pool/no-hook/host-to-wasm - untyped - nop-params-and-results
                            time:   [2.5501 us 2.5593 us 2.5700 us]
                            change: [+8.8802% +10.976% +12.937%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 16 outliers among 100 measurements (16.00%)
      5 (5.00%) high mild
      11 (11.00%) high severe
    
    async-pool/hook-sync/host-to-wasm - typed - nop
                            time:   [2.4135 us 2.4190 us 2.4254 us]
                            change: [+8.3640% +9.3774% +10.435%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 11 outliers among 100 measurements (11.00%)
      6 (6.00%) high mild
      5 (5.00%) high severe
    async-pool/hook-sync/host-to-wasm - untyped - nop
                            time:   [2.5172 us 2.5248 us 2.5357 us]
                            change: [+11.543% +12.750% +13.982%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 8 outliers among 100 measurements (8.00%)
      1 (1.00%) high mild
      7 (7.00%) high severe
    async-pool/hook-sync/host-to-wasm - typed - nop-params-and-results
                            time:   [2.4214 us 2.4353 us 2.4532 us]
                            change: [+1.5158% +5.0872% +8.6765%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 15 outliers among 100 measurements (15.00%)
      2 (2.00%) high mild
      13 (13.00%) high severe
    async-pool/hook-sync/host-to-wasm - untyped - nop-params-and-results
                            time:   [2.5499 us 2.5607 us 2.5748 us]
                            change: [+10.146% +12.459% +14.919%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 18 outliers among 100 measurements (18.00%)
      3 (3.00%) high mild
      15 (15.00%) high severe
    
    sync/no-hook/wasm-to-host - nop - typed
                            time:   [6.6135 ns 6.6288 ns 6.6452 ns]
                            change: [+37.927% +38.837% +39.869%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 7 outliers among 100 measurements (7.00%)
      2 (2.00%) high mild
      5 (5.00%) high severe
    sync/no-hook/wasm-to-host - nop-params-and-results - typed
                            time:   [15.930 ns 15.993 ns 16.067 ns]
                            change: [+3.9583% +5.6286% +7.2430%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 12 outliers among 100 measurements (12.00%)
      11 (11.00%) high mild
      1 (1.00%) high severe
    sync/no-hook/wasm-to-host - nop - untyped
                            time:   [20.596 ns 20.640 ns 20.690 ns]
                            change: [+4.3293% +5.2047% +6.0935%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 10 outliers among 100 measurements (10.00%)
      5 (5.00%) high mild
      5 (5.00%) high severe
    sync/no-hook/wasm-to-host - nop-params-and-results - untyped
                            time:   [42.659 ns 42.882 ns 43.159 ns]
                            change: [-2.1466% -0.5079% +1.2554%] (p = 0.58 > 0.05)
                            No change in performance detected.
    Found 15 outliers among 100 measurements (15.00%)
      1 (1.00%) high mild
      14 (14.00%) high severe
    sync/no-hook/wasm-to-host - nop - unchecked
                            time:   [10.671 ns 10.691 ns 10.713 ns]
                            change: [+83.911% +87.620% +92.062%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 9 outliers among 100 measurements (9.00%)
      2 (2.00%) high mild
      7 (7.00%) high severe
    sync/no-hook/wasm-to-host - nop-params-and-results - unchecked
                            time:   [11.136 ns 11.190 ns 11.263 ns]
                            change: [-29.719% -28.446% -27.029%] (p = 0.00 < 0.05)
                            Performance has improved.
    Found 14 outliers among 100 measurements (14.00%)
      4 (4.00%) high mild
      10 (10.00%) high severe
    
    sync/hook-sync/wasm-to-host - nop - typed
                            time:   [6.7964 ns 6.8087 ns 6.8226 ns]
                            change: [+21.531% +24.206% +27.331%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 14 outliers among 100 measurements (14.00%)
      4 (4.00%) high mild
      10 (10.00%) high severe
    sync/hook-sync/wasm-to-host - nop-params-and-results - typed
                            time:   [15.865 ns 15.921 ns 15.985 ns]
                            change: [+4.8466% +6.3330% +7.8317%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 16 outliers among 100 measurements (16.00%)
      3 (3.00%) high mild
      13 (13.00%) high severe
    sync/hook-sync/wasm-to-host - nop - untyped
                            time:   [21.505 ns 21.587 ns 21.677 ns]
                            change: [+8.0908% +9.1943% +10.254%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 8 outliers among 100 measurements (8.00%)
      4 (4.00%) high mild
      4 (4.00%) high severe
    sync/hook-sync/wasm-to-host - nop-params-and-results - untyped
                            time:   [44.018 ns 44.128 ns 44.261 ns]
                            change: [-1.4671% -0.0458% +1.2443%] (p = 0.94 > 0.05)
                            No change in performance detected.
    Found 14 outliers among 100 measurements (14.00%)
      5 (5.00%) high mild
      9 (9.00%) high severe
    sync/hook-sync/wasm-to-host - nop - unchecked
                            time:   [11.264 ns 11.326 ns 11.387 ns]
                            change: [+80.225% +81.659% +83.068%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 6 outliers among 100 measurements (6.00%)
      3 (3.00%) high mild
      3 (3.00%) high severe
    sync/hook-sync/wasm-to-host - nop-params-and-results - unchecked
                            time:   [11.816 ns 11.865 ns 11.920 ns]
                            change: [-29.152% -28.040% -26.957%] (p = 0.00 < 0.05)
                            Performance has improved.
    Found 14 outliers among 100 measurements (14.00%)
      8 (8.00%) high mild
      6 (6.00%) high severe
    
    async/no-hook/wasm-to-host - nop - typed
                            time:   [6.6221 ns 6.6385 ns 6.6569 ns]
                            change: [+43.618% +44.755% +45.965%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 13 outliers among 100 measurements (13.00%)
      6 (6.00%) high mild
      7 (7.00%) high severe
    async/no-hook/wasm-to-host - nop-params-and-results - typed
                            time:   [15.884 ns 15.929 ns 15.983 ns]
                            change: [+3.5987% +5.2053% +6.7846%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 16 outliers among 100 measurements (16.00%)
      3 (3.00%) high mild
      13 (13.00%) high severe
    async/no-hook/wasm-to-host - nop - untyped
                            time:   [20.615 ns 20.702 ns 20.821 ns]
                            change: [+6.9799% +8.1212% +9.2819%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 10 outliers among 100 measurements (10.00%)
      2 (2.00%) high mild
      8 (8.00%) high severe
    async/no-hook/wasm-to-host - nop-params-and-results - untyped
                            time:   [41.956 ns 42.207 ns 42.521 ns]
                            change: [-4.3057% -2.7730% -1.2428%] (p = 0.00 < 0.05)
                            Performance has improved.
    Found 14 outliers among 100 measurements (14.00%)
      3 (3.00%) high mild
      11 (11.00%) high severe
    async/no-hook/wasm-to-host - nop - unchecked
                            time:   [10.440 ns 10.474 ns 10.513 ns]
                            change: [+83.959% +85.826% +87.541%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 11 outliers among 100 measurements (11.00%)
      5 (5.00%) high mild
      6 (6.00%) high severe
    async/no-hook/wasm-to-host - nop-params-and-results - unchecked
                            time:   [11.476 ns 11.512 ns 11.554 ns]
                            change: [-29.857% -28.383% -26.978%] (p = 0.00 < 0.05)
                            Performance has improved.
    Found 12 outliers among 100 measurements (12.00%)
      1 (1.00%) low mild
      6 (6.00%) high mild
      5 (5.00%) high severe
    async/no-hook/wasm-to-host - nop - async-typed
                            time:   [26.427 ns 26.478 ns 26.532 ns]
                            change: [+6.5730% +7.4676% +8.3983%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 9 outliers among 100 measurements (9.00%)
      2 (2.00%) high mild
      7 (7.00%) high severe
    async/no-hook/wasm-to-host - nop-params-and-results - async-typed
                            time:   [28.557 ns 28.693 ns 28.880 ns]
                            change: [+1.9099% +3.7332% +5.9731%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 15 outliers among 100 measurements (15.00%)
      1 (1.00%) high mild
      14 (14.00%) high severe
    
    async/hook-sync/wasm-to-host - nop - typed
                            time:   [6.7488 ns 6.7630 ns 6.7784 ns]
                            change: [+19.935% +22.080% +23.683%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 9 outliers among 100 measurements (9.00%)
      4 (4.00%) high mild
      5 (5.00%) high severe
    async/hook-sync/wasm-to-host - nop-params-and-results - typed
                            time:   [15.928 ns 16.031 ns 16.149 ns]
                            change: [+5.5188% +6.9567% +8.3839%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 11 outliers among 100 measurements (11.00%)
      9 (9.00%) high mild
      2 (2.00%) high severe
    async/hook-sync/wasm-to-host - nop - untyped
                            time:   [21.930 ns 22.114 ns 22.296 ns]
                            change: [+4.6674% +7.7588% +10.375%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 4 outliers among 100 measurements (4.00%)
      3 (3.00%) high mild
      1 (1.00%) high severe
    async/hook-sync/wasm-to-host - nop-params-and-results - untyped
                            time:   [42.684 ns 42.858 ns 43.081 ns]
                            change: [-5.2957% -3.4693% -1.6217%] (p = 0.00 < 0.05)
                            Performance has improved.
    Found 14 outliers among 100 measurements (14.00%)
      2 (2.00%) high mild
      12 (12.00%) high severe
    async/hook-sync/wasm-to-host - nop - unchecked
                            time:   [11.026 ns 11.053 ns 11.086 ns]
                            change: [+70.751% +72.378% +73.961%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 10 outliers among 100 measurements (10.00%)
      5 (5.00%) high mild
      5 (5.00%) high severe
    async/hook-sync/wasm-to-host - nop-params-and-results - unchecked
                            time:   [11.840 ns 11.900 ns 11.982 ns]
                            change: [-27.977% -26.584% -24.887%] (p = 0.00 < 0.05)
                            Performance has improved.
    Found 18 outliers among 100 measurements (18.00%)
      3 (3.00%) high mild
      15 (15.00%) high severe
    async/hook-sync/wasm-to-host - nop - async-typed
                            time:   [27.601 ns 27.709 ns 27.882 ns]
                            change: [+8.1781% +9.1102% +10.030%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 11 outliers among 100 measurements (11.00%)
      2 (2.00%) low mild
      3 (3.00%) high mild
      6 (6.00%) high severe
    async/hook-sync/wasm-to-host - nop-params-and-results - async-typed
                            time:   [28.955 ns 29.174 ns 29.413 ns]
                            change: [+1.1226% +3.0366% +5.1126%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 13 outliers among 100 measurements (13.00%)
      7 (7.00%) high mild
      6 (6.00%) high severe
    
    async-pool/no-hook/wasm-to-host - nop - typed
                            time:   [6.5626 ns 6.5733 ns 6.5851 ns]
                            change: [+40.561% +42.307% +44.514%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 9 outliers among 100 measurements (9.00%)
      5 (5.00%) high mild
      4 (4.00%) high severe
    async-pool/no-hook/wasm-to-host - nop-params-and-results - typed
                            time:   [15.820 ns 15.886 ns 15.969 ns]
                            change: [+4.1044% +5.7928% +7.7122%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 17 outliers among 100 measurements (17.00%)
      4 (4.00%) high mild
      13 (13.00%) high severe
    async-pool/no-hook/wasm-to-host - nop - untyped
                            time:   [20.481 ns 20.521 ns 20.566 ns]
                            change: [+6.7962% +7.6950% +8.7612%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 11 outliers among 100 measurements (11.00%)
      6 (6.00%) high mild
      5 (5.00%) high severe
    async-pool/no-hook/wasm-to-host - nop-params-and-results - untyped
                            time:   [41.834 ns 41.998 ns 42.189 ns]
                            change: [-3.8185% -2.2687% -0.7541%] (p = 0.01 < 0.05)
                            Change within noise threshold.
    Found 13 outliers among 100 measurements (13.00%)
      3 (3.00%) high mild
      10 (10.00%) high severe
    async-pool/no-hook/wasm-to-host - nop - unchecked
                            time:   [10.353 ns 10.380 ns 10.414 ns]
                            change: [+82.042% +84.591% +87.205%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 7 outliers among 100 measurements (7.00%)
      4 (4.00%) high mild
      3 (3.00%) high severe
    async-pool/no-hook/wasm-to-host - nop-params-and-results - unchecked
                            time:   [11.123 ns 11.168 ns 11.228 ns]
                            change: [-30.813% -29.285% -27.874%] (p = 0.00 < 0.05)
                            Performance has improved.
    Found 12 outliers among 100 measurements (12.00%)
      11 (11.00%) high mild
      1 (1.00%) high severe
    async-pool/no-hook/wasm-to-host - nop - async-typed
                            time:   [27.442 ns 27.528 ns 27.638 ns]
                            change: [+7.5215% +9.9795% +12.266%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 18 outliers among 100 measurements (18.00%)
      3 (3.00%) high mild
      15 (15.00%) high severe
    async-pool/no-hook/wasm-to-host - nop-params-and-results - async-typed
                            time:   [29.014 ns 29.148 ns 29.312 ns]
                            change: [+2.0227% +3.4722% +4.9047%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 7 outliers among 100 measurements (7.00%)
      6 (6.00%) high mild
      1 (1.00%) high severe
    
    async-pool/hook-sync/wasm-to-host - nop - typed
                            time:   [6.7916 ns 6.8116 ns 6.8325 ns]
                            change: [+20.937% +22.050% +23.281%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 11 outliers among 100 measurements (11.00%)
      5 (5.00%) high mild
      6 (6.00%) high severe
    async-pool/hook-sync/wasm-to-host - nop-params-and-results - typed
                            time:   [15.917 ns 15.975 ns 16.051 ns]
                            change: [+4.6404% +6.4217% +8.3075%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 16 outliers among 100 measurements (16.00%)
      5 (5.00%) high mild
      11 (11.00%) high severe
    async-pool/hook-sync/wasm-to-host - nop - untyped
                            time:   [21.558 ns 21.612 ns 21.679 ns]
                            change: [+8.1158% +9.1409% +10.217%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 9 outliers among 100 measurements (9.00%)
      2 (2.00%) high mild
      7 (7.00%) high severe
    async-pool/hook-sync/wasm-to-host - nop-params-and-results - untyped
                            time:   [42.475 ns 42.614 ns 42.775 ns]
                            change: [-6.3613% -4.4709% -2.7647%] (p = 0.00 < 0.05)
                            Performance has improved.
    Found 18 outliers among 100 measurements (18.00%)
      3 (3.00%) high mild
      15 (15.00%) high severe
    async-pool/hook-sync/wasm-to-host - nop - unchecked
                            time:   [11.150 ns 11.195 ns 11.247 ns]
                            change: [+74.424% +77.056% +79.811%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 14 outliers among 100 measurements (14.00%)
      3 (3.00%) high mild
      11 (11.00%) high severe
    async-pool/hook-sync/wasm-to-host - nop-params-and-results - unchecked
                            time:   [11.639 ns 11.695 ns 11.760 ns]
                            change: [-30.212% -29.023% -27.954%] (p = 0.00 < 0.05)
                            Performance has improved.
    Found 15 outliers among 100 measurements (15.00%)
      7 (7.00%) high mild
      8 (8.00%) high severe
    async-pool/hook-sync/wasm-to-host - nop - async-typed
                            time:   [27.480 ns 27.712 ns 27.984 ns]
                            change: [+2.9764% +6.5061% +9.8914%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 8 outliers among 100 measurements (8.00%)
      6 (6.00%) high mild
      2 (2.00%) high severe
    async-pool/hook-sync/wasm-to-host - nop-params-and-results - async-typed
                            time:   [29.218 ns 29.380 ns 29.600 ns]
                            change: [+5.2283% +7.7247% +10.822%] (p = 0.00 < 0.05)
                            Performance has regressed.
    Found 16 outliers among 100 measurements (16.00%)
      2 (2.00%) high mild
      14 (14.00%) high severe
    
    wasmtime:api cranelift cranelift:area:machinst cranelift:meta fuzzing cranelift:area:aarch64 cranelift:area:x64 wasmtime:ref-types wasmtime:config 
    opened by fitzgen 40
  • wasm64 support

    We should consider supporting wasm64 modules, not just wasm32; people will want to run with large linear address spaces, both to process large amounts of data and to provide address space for shared mappings or file mappings.

    Opening this issue to start discussing what that support should look like, and how we can do that with minimal complexity or duplication.

    opened by joshtriplett 40
  • externref: implement stack map-based garbage collection

    For host VM code, we use plain reference counting, where cloning increments the reference count, and dropping decrements it. We can avoid many of the on-stack increment/decrement operations that typically plague the performance of reference counting via Rust's ownership and borrowing system. Moving a VMExternRef avoids mutating its reference count, and borrowing it either avoids the reference count increment or delays it until if/when the VMExternRef is cloned.
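
    The ownership point above can be seen with a small sketch. It uses `std::rc::Rc` as a stand-in for `VMExternRef` (an assumption for illustration only; the real type is Wasmtime's own), and the helper names `consume`, `peek`, and `keep_if_long` are hypothetical:

    ```rust
    use std::rc::Rc;

    /// Takes ownership: the handle is moved in, so the count is not touched on entry.
    fn consume(r: Rc<String>) -> usize {
        Rc::strong_count(&r)
    }

    /// Borrows: the callee reads through the reference without incrementing.
    fn peek(r: &Rc<String>) -> usize {
        Rc::strong_count(r)
    }

    /// Borrows, and clones only if it decides to keep the value around.
    fn keep_if_long(r: &Rc<String>, kept: &mut Vec<Rc<String>>) {
        if r.len() > 3 {
            kept.push(Rc::clone(r)); // the only increment in this sketch
        }
    }

    /// Returns the strong count observed at each step.
    fn demo() -> (usize, usize, usize, usize) {
        let r = Rc::new(String::from("externref"));
        let while_borrowed = peek(&r);
        let mut kept = Vec::new();
        keep_if_long(&r, &mut kept);
        let after_clone = Rc::strong_count(&r);
        let while_moved = consume(r); // `r` is dropped when `consume` returns
        let after_drop = Rc::strong_count(&kept[0]);
        (while_borrowed, after_clone, while_moved, after_drop)
    }
    ```

    Here `demo()` returns `(1, 2, 2, 1)`: borrowing and moving leave the count alone, and only the explicit clone increments it.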

    When passing a VMExternRef into compiled Wasm code, we don't want to do reference count mutations for every compiled local.{get,set}, nor for every function call. Therefore, we use a variation of deferred reference counting, where we only mutate reference counts when storing VMExternRefs somewhere that outlives the activation: into a global or table. Simultaneously, we over-approximate the set of VMExternRefs that are inside Wasm function activations. Periodically, we walk the stack at GC safe points, and use stack map information to precisely identify the set of VMExternRefs inside Wasm activations. Then we take the difference between this precise set and our over-approximation, and decrement the reference count for each of the VMExternRefs that are in our over-approximation but not in the precise set. Finally, the over-approximation is replaced with the precise set.

    The VMExternRefActivationsTable implements the over-approximated set of VMExternRefs referenced by Wasm activations. Calling a Wasm function and passing it a VMExternRef moves the VMExternRef into the table, and the compiled Wasm function logically "borrows" the VMExternRef from the table. Similarly, global.get and table.get operations clone the gotten VMExternRef into the VMExternRefActivationsTable and then "borrow" the reference out of the table.

    When a VMExternRef is returned to host code from a Wasm function, the host increments the reference count (because the reference is logically "borrowed" from the VMExternRefActivationsTable and the reference count from the table will be dropped at the next GC).
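
    As a rough model of the GC step described above (not the actual Wasmtime implementation), the sketch below tracks references by integer id. `Heap`, `ActivationsTable`, and `RefId` are hypothetical names, and the `precise` set is passed in directly instead of being computed from stack maps:

    ```rust
    use std::collections::{HashMap, HashSet};

    type RefId = u32;

    /// Toy reference-count store; an id is live while its count is non-zero.
    #[derive(Default)]
    struct Heap {
        counts: HashMap<RefId, usize>,
    }

    impl Heap {
        fn inc(&mut self, id: RefId) {
            *self.counts.entry(id).or_insert(0) += 1;
        }

        fn dec(&mut self, id: RefId) {
            let remaining = match self.counts.get_mut(&id) {
                Some(count) => {
                    *count -= 1;
                    *count
                }
                None => return,
            };
            if remaining == 0 {
                self.counts.remove(&id);
            }
        }

        fn is_live(&self, id: RefId) -> bool {
            self.counts.contains_key(&id)
        }
    }

    /// Over-approximation of the references held by Wasm activations.
    #[derive(Default)]
    struct ActivationsTable {
        over_approx: HashSet<RefId>,
    }

    impl ActivationsTable {
        /// A reference enters Wasm: the table holds one count on its behalf.
        fn insert(&mut self, heap: &mut Heap, id: RefId) {
            if self.over_approx.insert(id) {
                heap.inc(id);
            }
        }

        /// `precise` is the set found on the stack at a GC safe point.
        fn gc(&mut self, heap: &mut Heap, precise: &HashSet<RefId>) {
            // Drop the table's count for everything no longer on the stack.
            for id in self.over_approx.difference(precise) {
                heap.dec(*id);
            }
            // Equivalent to replacing the table with `precise` when `precise`
            // is a subset of the over-approximation, which this model assumes.
            self.over_approx.retain(|id| precise.contains(id));
        }
    }

    fn demo() -> (bool, bool) {
        let mut heap = Heap::default();
        let mut table = ActivationsTable::default();
        table.insert(&mut heap, 1);
        table.insert(&mut heap, 2);
        // The stack walk finds only reference 1 still in use.
        table.gc(&mut heap, &HashSet::from([1]));
        (heap.is_live(1), heap.is_live(2))
    }
    ```

    No count is touched between `insert` and `gc`, which is the point of deferring: local.get, local.set, and calls inside Wasm cost nothing in this model.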

    For more general information on deferred reference counting, see An Examination of Deferred Reference Counting and Cycle Detection by Quinane: https://openresearch-repository.anu.edu.au/bitstream/1885/42030/2/hon-thesis.pdf

    cc #929

    Fixes #1804

    Depends on https://github.com/rust-lang/backtrace-rs/pull/341

    wasmtime:api cranelift 
    opened by fitzgen 33
  • [meta] Migrate instruction selection to ISLE

    This is a meta issue to track the migration from hand-written instruction selection and lowering over to using the ISLE DSL.

    As you port lowering for a clif opcode over to ISLE, please check the associated box (or leave a comment if you don't have edit permissions, and I or someone else can check the box for you). Hopefully this will help us focus our porting efforts and finish the migration in a timely manner, as well as avoid stepping on each other's toes by having two people accidentally port the same opcode lowerings.

    cc @alexcrichton @cfallin @abrown @jlb6740 @uweigand @sparker-arm @akirilov-arm

    x86_64 -- DONE!

    • [x] Opcode::Clz
    • [x] Opcode::Ctz
    • [x] Opcode::Popcnt
    • [x] Opcode::Bitrev
    • [x] Opcode::IsNull
    • [x] Opcode::IsInvalid
    • [x] Opcode::Uextend
    • [x] Opcode::Sextend
    • [x] Opcode::Breduce
    • [x] Opcode::Bextend
    • [x] Opcode::Ireduce
    • [x] Opcode::Bint
    • [x] Opcode::Icmp
    • [x] Opcode::Fcmp
    • [x] Opcode::FallthroughReturn
    • [x] Opcode::Return
    • [x] Opcode::Call
    • [x] Opcode::CallIndirect
    • [x] Opcode::Debugtrap
    • [x] Opcode::Trapif
    • [x] Opcode::Trapff
    • [x] Opcode::WideningPairwiseDotProductS
    • [x] Opcode::Fadd
    • [x] Opcode::Fsub
    • [x] Opcode::Fmul
    • [x] Opcode::Fdiv
    • [x] Opcode::Fmin
    • [x] Opcode::Fmax
    • [x] Opcode::FminPseudo
    • [x] Opcode::FmaxPseudo
    • [x] Opcode::Sqrt
    • [x] Opcode::Fpromote
    • [x] Opcode::FvpromoteLow
    • [x] Opcode::Fdemote
    • [x] Opcode::Fvdemote
    • [x] Opcode::FcvtFromSint
    • [x] Opcode::FcvtLowFromSint
    • [x] Opcode::FcvtFromUint
    • [x] Opcode::FcvtToUint
    • [x] Opcode::FcvtToUintSat
    • [x] Opcode::FcvtToSint
    • [x] Opcode::FcvtToSintSat
    • [x] Opcode::IaddPairwise
    • [x] Opcode::UwidenHigh
    • [x] Opcode::UwidenLow
    • [x] Opcode::SwidenHigh
    • [x] Opcode::SwidenLow
    • [x] Opcode::Snarrow
    • [x] Opcode::Unarrow
    • [x] Opcode::Bitcast
    • [x] Opcode::Fabs
    • [x] Opcode::Fneg
    • [x] Opcode::Fcopysign
    • [x] Opcode::Ceil
    • [x] Opcode::Floor
    • [x] Opcode::Nearest
    • [x] Opcode::Trunc
    • [x] Opcode::Load
    • [x] Opcode::Uload8
    • [x] Opcode::Sload8
    • [x] Opcode::Uload16
    • [x] Opcode::Sload16
    • [x] Opcode::Uload32
    • [x] Opcode::Sload32
    • [x] Opcode::Sload8x8
    • [x] Opcode::Uload8x8
    • [x] Opcode::Sload16x4
    • [x] Opcode::Uload16x4
    • [x] Opcode::Sload32x2
    • [x] Opcode::Uload32x2
    • [x] Opcode::Store
    • [x] Opcode::Istore8
    • [x] Opcode::Istore16
    • [x] Opcode::Istore32
    • [x] Opcode::AtomicRmw
    • [x] Opcode::AtomicCas
    • [x] Opcode::AtomicLoad
    • [x] Opcode::AtomicStore
    • [x] Opcode::Fence
    • [x] Opcode::FuncAddr
    • [x] Opcode::SymbolValue
    • [x] Opcode::StackAddr
    • [x] Opcode::Select
    • [x] Opcode::Selectif
    • [x] Opcode::SelectifSpectreGuard
    • [x] Opcode::Udiv
    • [x] Opcode::Urem
    • [x] Opcode::Sdiv
    • [x] Opcode::Srem
    • [x] Opcode::Umulhi
    • [x] Opcode::Smulhi
    • [x] Opcode::GetPinnedReg
    • [x] Opcode::SetPinnedReg
    • [x] Opcode::Vconst
    • [x] Opcode::RawBitcast
    • [x] Opcode::Shuffle
    • [x] Opcode::Swizzle
    • [x] Opcode::Insertlane
    • [x] Opcode::Extractlane
    • [x] Opcode::ScalarToVector
    • [x] Opcode::Splat
    • [x] Opcode::VanyTrue
    • [x] Opcode::VallTrue
    • [x] Opcode::VhighBits
    • [x] Opcode::Iconcat
    • [x] Opcode::Isplit
    • [x] Opcode::TlsValue
    • [x] Opcode::SqmulRoundSat
    • [x] Opcode::Uunarrow

    aarch64 -- DONE!

    • [x] Opcode::Load
    • [x] Opcode::Uload8
    • [x] Opcode::Sload8
    • [x] Opcode::Uload16
    • [x] Opcode::Sload16
    • [x] Opcode::Uload32
    • [x] Opcode::Sload32
    • [x] Opcode::Sload8x8
    • [x] Opcode::Uload8x8
    • [x] Opcode::Sload16x4
    • [x] Opcode::Uload16x4
    • [x] Opcode::Sload32x2
    • [x] Opcode::Uload32x2
    • [x] Opcode::Store
    • [x] Opcode::Istore8
    • [x] Opcode::Istore16
    • [x] Opcode::Istore32
    • [x] Opcode::StackAddr
    • [x] Opcode::AtomicRmw
    • [x] Opcode::AtomicCas
    • [x] Opcode::AtomicLoad
    • [x] Opcode::AtomicStore
    • [x] Opcode::Fence
    • [x] Opcode::Select
    • [x] Opcode::Selectif
    • [x] Opcode::SelectifSpectreGuard
    • [x] Opcode::Bitselect
    • [x] Opcode::Vselect
    • [x] Opcode::Trueif
    • [x] Opcode::Trueff
    • [x] Opcode::IsNull
    • [x] Opcode::IsInvalid
    • [x] Opcode::Copy
    • [x] Opcode::Breduce
    • [x] Opcode::Ireduce
    • [x] Opcode::Bextend
    • [x] Opcode::Bmask
    • [x] Opcode::Bint
    • [x] Opcode::Bitcast
    • [x] Opcode::FallthroughReturn
    • [x] Opcode::Return
    • [x] Opcode::Icmp
    • [x] Opcode::Fcmp
    • [x] Opcode::Debugtrap
    • [x] Opcode::Trap
    • [x] Opcode::ResumableTrap
    • [x] Opcode::Trapif
    • [x] Opcode::Trapff
    • [x] Opcode::FuncAddr
    • [x] Opcode::SymbolValue
    • [x] Opcode::Call
    • [x] Opcode::CallIndirect
    • [x] Opcode::GetPinnedReg
    • [x] Opcode::SetPinnedReg
    • [x] Opcode::Vconst
    • [x] Opcode::RawBitcast
    • [x] Opcode::Extractlane
    • [x] Opcode::Insertlane
    • [x] Opcode::Splat
    • [x] Opcode::ScalarToVector
    • [x] Opcode::VallTrue
    • [x] Opcode::VanyTrue
    • [x] Opcode::VhighBits
    • [x] Opcode::Shuffle
    • [x] Opcode::Swizzle
    • [x] Opcode::Isplit
    • [x] Opcode::Iconcat
    • [x] Opcode::Imax
    • [x] Opcode::Umax
    • [x] Opcode::Umin
    • [x] Opcode::Imin
    • [x] Opcode::IaddPairwise
    • [x] Opcode::WideningPairwiseDotProductS
    • [x] Opcode::Fadd
    • [x] Opcode::Fsub
    • [x] Opcode::Fmul
    • [x] Opcode::Fdiv
    • [x] Opcode::Fmin
    • [x] Opcode::Fmax
    • [x] Opcode::FminPseudo
    • [x] Opcode::FmaxPseudo
    • [x] Opcode::Sqrt
    • [x] Opcode::Fneg
    • [x] Opcode::Fabs
    • [x] Opcode::Fpromote
    • [x] Opcode::Fdemote
    • [x] Opcode::Ceil
    • [x] Opcode::Floor
    • [x] Opcode::Trunc
    • [x] Opcode::Nearest
    • [x] Opcode::Fma
    • [x] Opcode::Fcopysign
    • [x] Opcode::FcvtToUint
    • [x] Opcode::FcvtToSint
    • [x] Opcode::FcvtFromUint
    • [x] Opcode::FcvtFromSint
    • [x] Opcode::FcvtToUintSat
    • [x] Opcode::FcvtToSintSat
    • [x] Opcode::IaddIfcout
    • [x] Opcode::Iabs
    • [x] Opcode::AvgRound
    • [x] Opcode::Snarrow
    • [x] Opcode::Unarrow
    • [x] Opcode::Uunarrow
    • [x] Opcode::SwidenLow
    • [x] Opcode::SwidenHigh
    • [x] Opcode::UwidenLow
    • [x] Opcode::UwidenHigh
    • [x] Opcode::TlsValue
    • [x] Opcode::SqmulRoundSat
    • [x] Opcode::FcvtLowFromSint
    • [x] Opcode::FvpromoteLow
    • [x] Opcode::Fvdemote
    • [x] Branches

    s390x -- DONE!

    • [x] Calls
    • [x] Returns
    • [x] Traps
    • [x] Branches
    cranelift cranelift:E-compiler cranelift:area:aarch64 cranelift:area:x64 isle cranelift:area:s390x 
    opened by fitzgen 32
  • memfd/madvise-based CoW pooling allocator

    Add a pooling allocator mode based on copy-on-write mappings of memfds.

    As first suggested by Jan on the Zulip here [1], a cheap and effective way to obtain copy-on-write semantics of a "backing image" for a Wasm memory is to mmap a file with MAP_PRIVATE. The memfd mechanism provided by the Linux kernel allows us to create anonymous, in-memory-only files that we can use for this mapping, so we can construct the image contents on-the-fly then effectively create a CoW overlay. Furthermore, and importantly, madvise(MADV_DONTNEED, ...) will discard the CoW overlay, returning the mapping to its original state.

    By itself this is almost enough for a very fast instantiation-termination loop of the same image over and over, without changing the address space mapping at all (which is expensive). The only missing bit is how to implement heap growth. But here memfds can help us again: if we create another anonymous file and map it where the extended parts of the heap would go, we can take advantage of the fact that a mmap() mapping can be larger than the file itself, with accesses beyond the end generating a SIGBUS, and the fact that we can cheaply resize the file with ftruncate, even after a mapping exists. So we can map the "heap extension" file once with the maximum memory-slot size and grow the memfd itself as memory.grow operations occur.

    The above CoW technique and heap-growth technique together allow us a fastpath of madvise() and ftruncate() only when we re-instantiate the same module over and over, as long as we can reuse the same slot. This fastpath avoids all whole-process address-space locks in the Linux kernel, which should mean it is highly scalable. It also avoids the cost of copying data on read, as the uffd heap backend does when servicing pagefaults; the kernel's own optimized CoW logic (same as used by all file mmaps) is used instead.
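
    The semantics of that fastpath can be modeled in plain Rust without any syscalls. This is only a toy model under stated assumptions: `Slot`, `PAGE`, and the method names are hypothetical, `reset` stands in for madvise(MADV_DONTNEED) plus truncating the extension file, `grow` stands in for ftruncate, and a `None` read stands in for the SIGBUS on access past the end of the file:

    ```rust
    use std::collections::HashMap;

    const PAGE: usize = 4096;

    /// One memory slot: a shared read-only image plus a private overlay.
    struct Slot<'a> {
        image: &'a [u8],
        overlay: HashMap<usize, Vec<u8>>, // page number -> private copy
        heap_len: usize,                  // currently accessible bytes
        max_len: usize,                   // size of the reserved mapping
    }

    impl<'a> Slot<'a> {
        fn new(image: &'a [u8], max_len: usize) -> Self {
            Slot { image, overlay: HashMap::new(), heap_len: image.len(), max_len }
        }

        /// Byte as seen before any write: image contents, then zeroes.
        fn base_byte(&self, addr: usize) -> u8 {
            self.image.get(addr).copied().unwrap_or(0)
        }

        fn read(&self, addr: usize) -> Option<u8> {
            if addr >= self.heap_len {
                return None; // models the fault beyond the end of the file
            }
            match self.overlay.get(&(addr / PAGE)) {
                Some(page) => Some(page[addr % PAGE]),
                None => Some(self.base_byte(addr)),
            }
        }

        fn write(&mut self, addr: usize, value: u8) -> bool {
            if addr >= self.heap_len {
                return false;
            }
            let page_no = addr / PAGE;
            if !self.overlay.contains_key(&page_no) {
                // First write to this page: copy it, leaving the image untouched.
                let start = page_no * PAGE;
                let copy: Vec<u8> = (start..start + PAGE).map(|a| self.base_byte(a)).collect();
                self.overlay.insert(page_no, copy);
            }
            self.overlay.get_mut(&page_no).unwrap()[addr % PAGE] = value;
            true
        }

        /// Models growing the extension file; fails past the reserved size.
        fn grow(&mut self, delta: usize) -> bool {
            match self.heap_len.checked_add(delta) {
                Some(new_len) if new_len <= self.max_len => {
                    self.heap_len = new_len;
                    true
                }
                _ => false,
            }
        }

        /// Models discarding the overlay and shrinking back to the image.
        fn reset(&mut self) {
            self.overlay.clear();
            self.heap_len = self.image.len();
        }
    }

    fn demo() -> (Option<u8>, Option<u8>, Option<u8>, Option<u8>) {
        let image = vec![7u8; PAGE];
        let mut slot = Slot::new(&image, 4 * PAGE);
        slot.write(0, 42);
        slot.grow(PAGE);
        slot.write(PAGE + 1, 9);
        let dirty = slot.read(0);
        let grown = slot.read(PAGE + 1);
        slot.reset();
        (dirty, grown, slot.read(0), slot.read(PAGE + 1))
    }
    ```

    In the model, reuse of a slot costs one `reset` call and never rebuilds the image, which mirrors why the real fastpath needs no new mappings.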

    There are still a few loose ends in this PR, which I intend to tie up before merging:

    • There is no InstanceAllocationStrategy yet that attempts to actually reuse instance slots; that should be added ASAP. For testing so far, I have just instantiated the same one module repeatedly (so reuse naturally occurs).

    • The guard-page strategy is slightly wrong; I need to implement the pre-heap guard region as well. This will be done by performing another mapping once, to reserve the whole address range, then mmap'ing the image and extension file on top at appropriate offsets (2GiB, 2GiB plus image size).

    Thanks to Jan on Zulip (are you also @koute from #3691?) for the initial idea/inspiration! This PR is meant to demonstrate my thoughts on how to build the feature and spawn discussion; now that we see both approaches hopefully we can work out a way to meet the needs of both of our use-cases.

    [1] https://bytecodealliance.zulipchat.com/#narrow/stream/206238-general/topic/Copy.20on.20write.20based.20instance.20reuse/near/266657772

    wasmtime:api 
    opened by cfallin 32
  • support a few DWARF-5 only features

    See #932.

    • accept and pass DebugAddrIndex, DebugStrOffsetsIndex attributes
    • skip DebugAddrBase and DebugStrOffsetsBase attributes when transforming; these are managed by the compilation unit elsewhere
    • accept and resolve DebugLineStrRef in line programs
    • read .debug_addr
    • read .debug_rnglists
    • read .debug_loclists
    • read .debug_line_str
    • read .debug_str_offsets
    • perform the DebugAddrIndex and DebugStrOffsetsIndex indirections
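
    For readers unfamiliar with the two indirections, here is a minimal sketch of what resolving them involves. It is not the code in this PR: it works on raw section bytes, assumes little-endian data, and the function names are hypothetical. An index selects a fixed-size entry counted from the unit's base offset in the section:

    ```rust
    /// Reads a little-endian unsigned integer of `size` bytes (at most 8).
    fn read_uint(section: &[u8], offset: usize, size: usize) -> Option<u64> {
        let bytes = section.get(offset..offset.checked_add(size)?)?;
        let mut buf = [0u8; 8];
        buf.get_mut(..size)?.copy_from_slice(bytes);
        Some(u64::from_le_bytes(buf))
    }

    /// Resolves an address index: entry `index` of `.debug_addr`,
    /// counted from the unit's address base.
    fn resolve_addr(
        debug_addr: &[u8],
        addr_base: usize,
        address_size: usize,
        index: usize,
    ) -> Option<u64> {
        let offset = addr_base.checked_add(index.checked_mul(address_size)?)?;
        read_uint(debug_addr, offset, address_size)
    }

    /// Resolves a string index: entry `index` of `.debug_str_offsets` gives
    /// an offset into `.debug_str`, where a NUL-terminated string starts.
    fn resolve_str<'a>(
        debug_str_offsets: &[u8],
        debug_str: &'a [u8],
        str_offsets_base: usize,
        offset_size: usize,
        index: usize,
    ) -> Option<&'a str> {
        let entry = str_offsets_base.checked_add(index.checked_mul(offset_size)?)?;
        let str_offset = read_uint(debug_str_offsets, entry, offset_size)? as usize;
        let tail = debug_str.get(str_offset..)?;
        let end = tail.iter().position(|&b| b == 0)?;
        std::str::from_utf8(&tail[..end]).ok()
    }
    ```

    Out-of-range indices return `None` instead of panicking, which is the behavior one would want when transforming untrusted debug info.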

    TODO:

    • [x] tests (added DWARF-5 test, but it needs a refresh, lldb test also needed).
    opened by ggreif 31
  • Implement path_link for Windows.

    This is probably the last missing syscall for Windows!

    This PR implements path_link for Windows and adds a non-strict version of the path_link integration test.

    I'm unsure about the error handling in path_link. MSDN doesn't say much about possible error codes for either CreateHardLinkA or CreateSymbolicLinkA. I mostly copied over the error conversion from path_symlink, but I'm not sure if it's correct. In particular, it's unclear to me what the purpose of strip_trailing_slashes_and_concatenate is.

    path_symlink will now also detect an attempt to create a dangling symlink and return ENOTSUP. (Is this the correct return code?)

    Currently the non-strictness of the test consists of:

    1. We use separate subdirectories subdir, subdir2, subdir3 for each test stage. This is because Windows will not remove the directory, and won't allow creating a directory with the same name, until the previous one has been deleted. I don't see any way of circumventing it, because the application may still try to access the directory through the unclosed file descriptor.
    2. path_link will return EACCES instead of EPERM when trying to create a link to a subdirectory. This violates the POSIX spec. We could manually check if the source path is a directory in case of ERROR_ACCESS_DENIED but this would cost us an extra syscall.
    3. Tests for dangling symlinks or symlink loops have been disabled. Alternatively, we could check if the attempt to create a dangling symlink returns ENOTSUP, but this doesn't make much sense while 1&2 are an issue.
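
    To make the trade-off in item 2 concrete, here is a hedged sketch of the "manual check" option. `Errno` and `link_errno` are hypothetical names, not the PR's code, and every error other than ERROR_ACCESS_DENIED is collapsed into a placeholder. The directory check is passed as a closure so the extra syscall is only paid on the access-denied path:

    ```rust
    #[derive(Debug, PartialEq, Eq)]
    enum Errno {
        Acces,
        Perm,
        Io, // placeholder for every other error in this sketch
    }

    /// Win32 ERROR_ACCESS_DENIED.
    const ERROR_ACCESS_DENIED: i32 = 5;

    /// Maps a raw OS error from a failed hard-link call to an errno.
    /// `source_is_dir` is invoked only when the OS reported access denied.
    fn link_errno(os_error: i32, source_is_dir: impl FnOnce() -> bool) -> Errno {
        if os_error != ERROR_ACCESS_DENIED {
            return Errno::Io;
        }
        if source_is_dir() {
            Errno::Perm // what POSIX expects for a link to a directory
        } else {
            Errno::Acces
        }
    }
    ```

    A caller could pass something like `|| std::fs::metadata(path).map(|m| m.is_dir()).unwrap_or(false)` as the closure; that call is where the extra syscall would happen.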

    Let me know what you think.

    Btw. @kubkon, according to this stackoverflow post Mac OS X 10.5+ permits hard links to directories, which our tests expect to fail.


    Notes about links and symlinks under Windows:

    • creating a symlink requires administrative privileges (SeCreateSymbolicLinkPrivilege). On Windows 10 this requirement may be removed, but this requires enabling developer mode
    • Windows distinguishes between file and directory symlinks
    • It's possible to create a dangling symlink, but the type (file/directory) has to be specified upon creation. The behavior in case of type mismatch is inconsistent. Precisely, suppose that a dangling file symlink foo -> bar is created and later, a directory bar is created. Then:
      • under msys64 bash, cd foo succeeds and the directory view is the same when accessed either directly or through the symlink
      • under cmd (both windowed and as a child process from msys64 bash), cd foo fails with "The directory name is invalid"
    wasi:impl wasi:tests wasi 
    opened by marmistrz 31
  • Implement lazy funcref table and anyfunc initialization.

    During instance initialization, we build two sorts of arrays eagerly:

    • We create an "anyfunc" (a VMCallerCheckedAnyfunc) for every function in an instance.

    • We initialize every element of a funcref table with an initializer to a pointer to one of these anyfuncs.

    Most instances will not touch (via call_indirect or table.get) all funcref table elements. And most anyfuncs will never be referenced, because most functions are never placed in tables or used with ref.func. Thus, both of these initialization tasks are quite wasteful. Profiling shows that a significant fraction of the remaining instance-initialization time after our other recent optimizations is going into these two tasks.

    This PR implements two basic ideas:

    • The anyfunc array can be lazily initialized as long as we retain the information needed to do so. A zero in the func-ptr part of the tuple means "uninitialized"; a null-check and slowpath does the initialization whenever we take a pointer to an anyfunc.

    • A funcref table can be lazily initialized as long as we retain a link to its corresponding instance and function index for each element. A zero in a table element means "uninitialized", and a slowpath does the initialization.

    The use of all-zeroes to mean "uninitialized" means that we can use fast memory clearing techniques, like madvise(DONTNEED) on Linux or just freshly-mmap'd anonymous memory, to get to the initial state without a lot of memory writes.

    Funcref tables are a little tricky because funcrefs can be null. We need to distinguish "element was initially non-null, but user stored explicit null later" from "element never touched" (i.e., the lazy init should not blow away an explicitly stored null). We solve this by stealing the LSB from every funcref (anyfunc pointer): when the LSB is set, the funcref is initialized and we don't hit the lazy-init slowpath. We insert the bit on storing to the table and mask it off after loading.
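
    The tagging scheme can be sketched as follows. This is an illustrative model, not the PR's code: `LazyTable` and `INIT_BIT` are hypothetical names, funcrefs are plain `usize` values instead of anyfunc pointers, and the `initial` vector stands in for "instance plus function index":

    ```rust
    /// Tag bit stolen from the pointer; requires at least 2-byte alignment.
    const INIT_BIT: usize = 1;

    struct LazyTable {
        /// Raw elements: 0 means "never touched", otherwise `funcref | INIT_BIT`.
        elems: Vec<usize>,
        /// What each element lazily initializes to.
        initial: Vec<usize>,
        /// Counts slowpath entries, for demonstration only.
        slowpath_hits: usize,
    }

    impl LazyTable {
        fn new(initial: Vec<usize>) -> Self {
            // All-zero start state, as fresh anonymous memory would provide.
            LazyTable { elems: vec![0; initial.len()], initial, slowpath_hits: 0 }
        }

        /// Store: insert the bit so even an explicit null (0) reads as initialized.
        fn set(&mut self, index: usize, funcref: usize) {
            debug_assert_eq!(funcref & INIT_BIT, 0);
            self.elems[index] = funcref | INIT_BIT;
        }

        /// Load: take the slowpath if the bit is clear, then mask the bit off.
        fn get(&mut self, index: usize) -> usize {
            if (self.elems[index] & INIT_BIT) == 0 {
                self.slowpath_hits += 1;
                let value = self.initial[index];
                self.set(index, value);
            }
            self.elems[index] & !INIT_BIT
        }
    }
    ```

    Storing null with `set(i, 0)` leaves the raw element as `1`, so a later `get(i)` returns `0` without re-running the lazy initialization.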

    Performance effect on instantiation in the on-demand allocator (pooling allocator effect should be similar as the table-init path is the same):

    sequential/default/spidermonkey.wasm
                            time:   [71.886 us 72.012 us 72.133 us]
    
    sequential/default/spidermonkey.wasm
                            time:   [22.243 us 22.256 us 22.270 us]
                            change: [-69.117% -69.060% -69.000%] (p = 0.00 < 0.05)
                            Performance has improved.
    

    So, 72µs to 22µs, or a 69% reduction.

    wasmtime:api cranelift cranelift:area:machinst cranelift:wasm cranelift:area:x64 
    opened by cfallin 28
  • Debug a wasm application with reasonable amount of RAM

    I'm trying to run a large (~15 MB) wasm application that crashes on ud2. Is there any way to determine which function the crash address corresponds to?

    opened by whitequark 28
  • Support records, variants, enums, unions, and flags in the component model

    I'm splitting this issue out of https://github.com/bytecodealliance/wasmtime/issues/4185 to write up some thoughts on how this can be done. Specifically, the current Wasmtime support for the component model has mappings from many component model types to native Rust types, but not all of them. For example, integers, strings, lists, tuples, etc. are all mapped directly to Rust types. Basically, if a component model type's equivalent in Rust is in the Rust standard library, that mapping is already implemented. What that leaves to implement, however, is Rust-defined mappings for component model types that are "structural", like records.

    This issue is intended to document the current thinking of how we're going to expose this. The general idea is that we'll create a proc-macro crate, probably named something like wasmtime-component-macro, which is an internal dependency of the wasmtime crate. The various macros would then get reexported at the wasmtime::component::* namespace.

    Currently the bindings for host types are navigated through three traits: ComponentValue, Lift, and Lower. We'll want a custom derive for all three of these traits. Deriving Lift or Lower requires a ComponentValue derive as well, but users should be able to pick one of Lift and Lower without the other.

    record

    Records in the component model correspond to structs in Rust. The rough shape of this will be:

    use wasmtime::component::{ComponentValue, Lift, Lower};
    
    #[derive(ComponentValue, Lift, Lower)]
    #[component(record)]
    struct Foo {
        #[component(name = "foo-bar-baz")]
        a: i32,
        b: u32,
    }
    

    To typecheck correctly the record type must list fields in the same order as the fields listed in the Rust code for now. Field reordering may be implemented at a later date but for now we'll do strict matching. Fields must have both matching names and matching types.

    The #[component(record)] here may seem redundant but it's somewhat required below for variants/enums.

    The #[component(name = "...")] is intended to rename the field from the component model's perspective. The type-checking will test against the name specified.

    Using this derive on a tuple or empty struct will result in a compile-time error.

    variant

    Variants roughly correspond to Rust enums:

    use wasmtime::component::{ComponentValue, Lift, Lower};
    
    #[derive(ComponentValue, Lift, Lower)]
    #[component(variant)]
    enum Foo {
        #[component(name = "foo-bar-baz")]
        A(u32),
        B,
    }
    

    Typechecking, as with records, will check cases in order, and all cases must match in both name and payload. A missing payload in Rust is automatically interpreted as the unit payload in the component model.

    Variants with named fields (B { bar: u32 }) will be disallowed. Variants with multiple payloads (B(u32, u32)) will also be disallowed.

    Note that #[component(variant)] here distinguishes it from...

    enum

    use wasmtime::component::{ComponentValue, Lift, Lower};
    
    #[derive(ComponentValue, Lift, Lower)]
    #[component(enum)]
    enum Foo {
        #[component(name = "foo-bar-baz")]
        A,
        B,
    }
    

    Typechecking is similar to variants where the number/names of cases must all match.

    Variants with any payload are disallowed in this derive mode.

    union

    This will, perhaps surprisingly, still map to an enum in Rust since this is still a tagged union, not a literal C union:

    use wasmtime::component::{ComponentValue, Lift, Lower};
    
    #[derive(ComponentValue, Lift, Lower)]
    #[component(union)]
    enum Foo {
        A(u32),
        B(f32),
    }
    

    The number of cases and the types of each case must match a union definition to correctly typecheck. Union cases don't have names so renaming here isn't needed.

    A payload on each enum case in Rust is required, and like with variant it's required to be a tuple-variant with only one element. All other forms of payloads are disallowed. Note that the case names are purely informative on the Rust side; they don't affect the ABI or type-checking.

    flags

    These will be a bit "funkier" than the above since there's not something obvious to attach a #[derive] to:

    wasmtime::component::flags! {
        #[derive(Lift, Lower)]
        flags Foo {
            #[component(name = "...")]
            const A;
            const B;
            const C;
        }
    }
    

    The general idea here is to roughly take inspiration from the bitflags crate in terms of what the generated code does. Ideally this should have a convenient Debug implementation along with various constants to OR together and such in Rust. The exact syntax here is up for debate; this is just a strawman.
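
    To make the bitflags-like shape concrete, here is a hand-written sketch of what such generated code might look like. It is an assumption for illustration only: the `u32` representation, the `contains` helper, and the Debug output format are all invented here, and the Lift/Lower impls are omitted.

```rust
use std::fmt;
use std::ops::BitOr;

/// Hand-written stand-in for what a `flags!` invocation might expand to.
#[derive(Clone, Copy, PartialEq, Eq, Default)]
pub struct Foo(u32);

impl Foo {
    // One bit per declared flag, in declaration order.
    pub const A: Foo = Foo(1 << 0);
    pub const B: Foo = Foo(1 << 1);
    pub const C: Foo = Foo(1 << 2);

    /// Returns true if every bit set in `other` is also set in `self`.
    pub fn contains(self, other: Foo) -> bool {
        (self.0 & other.0) == other.0
    }
}

impl BitOr for Foo {
    type Output = Foo;
    fn bitor(self, rhs: Foo) -> Foo {
        Foo(self.0 | rhs.0)
    }
}

impl fmt::Debug for Foo {
    /// Prints the names of the set flags, e.g. `A | C`, or `(empty)`.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let names = [(Foo::A, "A"), (Foo::B, "B"), (Foo::C, "C")];
        let set: Vec<&str> = names
            .iter()
            .filter(|(flag, _)| self.contains(*flag))
            .map(|(_, name)| *name)
            .collect();
        if set.is_empty() {
            write!(f, "(empty)")
        } else {
            write!(f, "{}", set.join(" | "))
        }
    }
}
```

    With this sketch, `Foo::A | Foo::C` is a value that contains A and C but not B, and formatting it with `{:?}` lists the set flags by name.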

    Implementation Details

    One caveat is that the ComponentValue/Lift/Lower traits mention internal types in the wasmtime crate which aren't intended to be part of the public API. To solve this the macro will reference items in a path such as:

    wasmtime::component::__internal::the_name
    

    The __internal module will be #[doc(hidden)] and will only exist to reexport dependencies needed by the proc-macro. This crate may end up having a bland pub use wasmtime_environ or individual items, whatever works best.

    The actual generated trait impls will probably look very similar to the implementations for tuples and Result<T, E> that are already present in typed.rs.

    Alternatives

    One alternative to the above is to have #[derive(ComponentRecord)] instead of #[derive(ComponentValue)] #[component(record)], or something like that. While some discussions have historically leaned in this direction, with the introduction of the Lift and Lower traits I personally feel that the balance is now slightly in the other direction: it would be nice to keep each derive targeted at a specific trait and have the configuration for the derive happen afterwards.

    wasm-proposal:component-model 
    opened by alexcrichton 27
  • wasi: avoid buffer underflow with shared memory

    wasi: avoid buffer underflow with shared memory

    This change fixes an issue identified when using wasi-threads to perform file reads. In order to maintain Rust safety guarantees in the presence of WebAssembly shared memory, which can be modified concurrently by any of the running threads, the WASI implementations of fd_read and fd_pread were given special code paths when shared memory is detected: in these cases, the data is first read into a host-limited buffer and then subsequently copied into linear memory. The problem was that the rather-complex logic for doing this "buffer then copy" idea for multiple IO vectors could fail due to buffer underflow. If, e.g., a read was limited by the host to 64K (or even if the read returned less than the total buffer size) the UnsafeGuestSlice::copy_from_slice logic would fail, complaining that the sizes of both buffers were unequal.

    This change both simplifies and fixes the logic:

    • only the first IO vector is filled; this could represent a performance penalty for threaded programs, but the "buffer then copy" idea already imposes a non-trivial overhead. This simplifies the logic, allowing us to...
    • resize the shared memory buffer to the exact number of bytes read
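
    The two points above can be sketched as follows. This is a simplified model, not the code from the change: `read_first_iov` and `MAX_HOST_READ` are hypothetical names, plain `&mut [u8]` slices stand in for guest IO vectors, and skipping empty vectors is an assumption of this sketch. Real shared memory cannot be borrowed as `&mut`, which is why the actual code goes through a dedicated guest-slice copy.

```rust
use std::io::Read;

/// Hypothetical host-side cap on a single buffered read (the 64K figure from the text).
const MAX_HOST_READ: usize = 64 * 1024;

/// Fills only the first non-empty buffer and returns the number of bytes copied.
pub fn read_first_iov<R: Read>(
    src: &mut R,
    iovs: &mut [&mut [u8]],
) -> std::io::Result<usize> {
    let Some(first) = iovs.iter_mut().find(|iov| !iov.is_empty()) else {
        return Ok(0);
    };
    // Read into a host-owned buffer, never directly into guest memory.
    let mut host = vec![0u8; first.len().min(MAX_HOST_READ)];
    let n = src.read(&mut host)?;
    // Shrink to the exact number of bytes read so that both sides of the
    // copy have equal length, even on a short read.
    host.truncate(n);
    first[..n].copy_from_slice(&host);
    Ok(n)
}
```

    A short read (for example 2 bytes into a 4-byte vector) copies only those 2 bytes and leaves the rest of the destination untouched, which is the case that previously tripped the equal-length check.
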
    wasi 
    opened by abrown 2
  • Reimplement Wasmtime's DWARF transform and debugging support

    Reimplement Wasmtime's DWARF transform and debugging support

    We should reimplement the DWARFwasm to DWARFnative transformation pass that implements the GDB/LLDB debugging support in Wasmtime by separating DWARF translation from DWARF traversal. We could do this by defining a generic DWARF transformation pass that takes a generic visitor implementation, walks the read-only input DWARF, calls the corresponding visitor method for each DIE/attribute/value/line-table entry/etc... in the DWARF to produce a new DWARF entity, and writes that new DWARF entity into the output DWARF that is being built up. We would then implement a DWARFwasm to DWARFnative visitor.

    I think this approach would be much easier to implement, maintain, and ensure correctness of than our current open-coded transformation.

    Assuming this interface works out well and we prove it out, it could be worth upstreaming the generic transformation pass and visitor trait into gimli itself (cc @philipc).

    Potential hiccups could be that, for our purposes here, the visitor might not be exactly a simple map over the input DWARF (or "functor-ish") in that one DWARFwasm entity might become multiple DWARFnative entities (making it more "monad-ish", apologies if I'm just muddying the waters with this nomenclature). One example is that what might be a location list entry in Wasm could become multiple location list entries in native code due to register allocation, live range splitting, and spilling.
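
    A toy sketch of the "one entity in, possibly many out" shape, separating traversal from translation. Nothing here uses gimli or real DWARF: `Visitor`, `transform`, `Loc`, and `SplitAt` are hypothetical names made up for illustration, and a real design would need one visitor method per kind of DWARF entity.

```rust
/// Hypothetical visitor: each input entity may map to zero, one, or many outputs.
pub trait Visitor<In, Out> {
    fn visit(&mut self, input: &In) -> Vec<Out>;
}

/// Generic traversal: walks the read-only input in order and concatenates
/// whatever the visitor produces for each entity.
pub fn transform<In, Out, V: Visitor<In, Out>>(input: &[In], visitor: &mut V) -> Vec<Out> {
    input.iter().flat_map(|entity| visitor.visit(entity)).collect()
}

/// Toy location-list entry: a half-open code range and where the value lives.
#[derive(Debug, Clone, PartialEq)]
pub struct Loc {
    pub start: u64,
    pub end: u64,
    pub place: &'static str,
}

/// Toy translation that splits any range containing the given offset, imitating
/// a value that moves from a register to a stack slot partway through its range.
pub struct SplitAt(pub u64);

impl Visitor<Loc, Loc> for SplitAt {
    fn visit(&mut self, l: &Loc) -> Vec<Loc> {
        if l.start < self.0 && self.0 < l.end {
            vec![
                Loc { start: l.start, end: self.0, place: "register" },
                Loc { start: self.0, end: l.end, place: "stack slot" },
            ]
        } else {
            vec![l.clone()]
        }
    }
}
```

    Returning a `Vec` from `visit` and flattening in `transform` is what makes this "monad-ish" rather than a plain map: the traversal code does not need to know how many output entries a given input entry expands to.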

    Testing

    Our testing story for debugging support is very poor at the moment and the debugging support is correspondingly buggy. As part of this reimplementation, we should take the opportunity to improve our approach to testing.

    I think we can do something like this, in a loop:

    • generate a random C program with C-Smith
    • compile the program twice:
      1. to wasm32-wasi
      2. to the host target
    • attach gdb and/or lldb to
      1. wasmtime running the wasm version
      2. the native binary
    • single step N times (or until main exits) and at each point assert that:
      • the native and wasm programs are paused at the same location
      • the same variables are in scope
      • the variables in scope have the same values (at least for non-pointer scalars, we can tune the C-Smith flags we use to generate test programs as necessary)

    I think this should give us fairly high confidence in the correctness of the new DWARF transform.

    Unfortunately, this won't fit into OSS-Fuzz's paradigm super well. It involves a lot of wrangling external processes. I think we can do N iterations under normal cargo test with a fixed corpus of seeds, so that running cargo test twice runs the same set of test programs each time. And then in CI perhaps we can have a job that runs more iterations, or a nightly CI job that does a bunch of iterations, or something like that. To some degree, we can kick this can down the road and figure things out once we have the test infrastructure set up (even just running it manually whenever we touch this code would be a huge improvement over our current debugging testing strategy).

    cc @cfallin as this is something we have talked about together in the past.

    wasmtime:debugging 
    opened by fitzgen 0
  • cranelift: Forbid argument extensions for floats and SIMD vectors

    cranelift: Forbid argument extensions for floats and SIMD vectors

    👋 Hey,

    This fixes #5531 by adding a verifier check and changing fuzzgen to not produce this type of code.

    The check forbids argument extensions for anything that is not a scalar integer.
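
    The rule can be modelled in a few lines. This is a simplified sketch, not the verifier code from the PR: `Ty`, `ArgExt`, and `check_param` are hypothetical stand-ins rather than Cranelift's real types.

```rust
/// Hypothetical, simplified value types (not Cranelift's real `Type`).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Ty {
    I8,
    I16,
    I32,
    I64,
    F32,
    F64,
    I32X4,
}

/// Hypothetical model of an argument extension on a signature parameter.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ArgExt {
    None,
    Uext,
    Sext,
}

impl Ty {
    fn is_scalar_int(self) -> bool {
        matches!(self, Ty::I8 | Ty::I16 | Ty::I32 | Ty::I64)
    }
}

/// Rejects `uext`/`sext` on anything that is not a scalar integer.
pub fn check_param(ty: Ty, ext: ArgExt) -> Result<(), String> {
    if ext != ArgExt::None && !ty.is_scalar_int() {
        return Err(format!("{ext:?} is not allowed on non-integer type {ty:?}"));
    }
    Ok(())
}
```

    Under this model a parameter like `f32 sext` is rejected up front, instead of reaching a backend and failing there.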

    cranelift 
    opened by afonso360 0
  • Cranelift: Compile failure with argument extensions on floating point values on riscv64

    Cranelift: Compile failure with argument extensions on floating point values on riscv64

    👋 Hey,

    It looks like both riscv64 and s390x fail to compile a call to a function with a sext or uext argument extension on a floating point argument.

    I don't even know what those are supposed to do for floating point arguments, but they work on x86_64 and AArch64.

    .clif Test Case

    test compile
    target riscv64
    target s390x
    
    function %f4(f32) -> f32 {
        fn0 = %g(f32 sext) -> f32
    
    block0(v0: f32):
        v1 = call fn0(v0)
        return v1
    }
    

    Steps to Reproduce

    • clif-util test ./the-above.clif

    Expected Results

    The test should compile.

    Actual Results

    thread 'worker #1' panicked at 'assertion failed: `(left == right)`
      left: `Int`,
     right: `Float`', cranelift/codegen/src/machinst/abi.rs:2223:33
    note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
     ERROR cranelift_filetests::concurrent > FAIL: panicked in worker #1: assertion failed: `(left == right)`
      left: `Int`,
     right: `Float`
    FAIL ./lmao.clif: panicked in worker #1: assertion failed: `(left == right)`
      left: `Int`,
     right: `Float`
    1 tests
    Error: 1 failure
    

    Versions and Environment

    Cranelift version or commit: main Operating system: Linux Architecture: riscv64

    bug cranelift cranelift:area:riscv64 
    opened by afonso360 2
  • Cranelift: SIMD `icmp` unsigned comparisons for `i64x2` are not implemented on x86_64

    Cranelift: SIMD `icmp` unsigned comparisons for `i64x2` are not implemented on x86_64

    👋 Hey,

    It looks like we have a hole in our lowerings: all unsigned comparison operators are unimplemented for i64x2 types only (they work for i32x4/i16x8/i8x16).

    .clif Test Case

    test run
    target x86_64
    
    function %simd_icmp_uge_i64(i64x2, i64x2) -> i64x2 {
    block0(v0: i64x2, v1: i64x2):
        v2 = icmp uge v0, v1
        return v2
    }
    ; run: %simd_icmp_uge_i64([0 1], [0 0]) == [-1 -1]
    ; run: %simd_icmp_uge_i64([-1 0], [-1 1]) == [-1 0]
    ; run: %simd_icmp_uge_i64([-5 1], [-1 -1]) == [0 0]
    ; run: %simd_icmp_uge_i64([0 0], [0 0]) == [-1 -1]
    
    
    function %simd_icmp_ugt_i64(i64x2, i64x2) -> i64x2 {
    block0(v0: i64x2, v1: i64x2):
        v2 = icmp ugt v0, v1
        return v2
    }
    ; run: %simd_icmp_ugt_i64([0 1], [0 0]) == [0 -1]
    ; run: %simd_icmp_ugt_i64([-1 0], [-1 1]) == [0 0]
    ; run: %simd_icmp_ugt_i64([-5 1], [-1 -1]) == [0 0]
    ; run: %simd_icmp_ugt_i64([0 0], [0 0]) == [0 0]
    
    function %simd_icmp_ule_i64(i64x2, i64x2) -> i64x2 {
    block0(v0: i64x2, v1: i64x2):
        v2 = icmp ule v0, v1
        return v2
    }
    ; run: %simd_icmp_ule_i64([0 1], [0 0]) == [-1 0]
    ; run: %simd_icmp_ule_i64([-1 0], [-1 1]) == [-1 -1]
    ; run: %simd_icmp_ule_i64([-5 1], [-1 -1]) == [-1 -1]
    ; run: %simd_icmp_ule_i64([0 0], [0 0]) == [-1 -1]
    
    function %simd_icmp_ult_i64(i64x2, i64x2) -> i64x2 {
    block0(v0: i64x2, v1: i64x2):
        v2 = icmp ult v0, v1
        return v2
    }
    ; run: %simd_icmp_ult_i64([0 1], [0 0]) == [0 0]
    ; run: %simd_icmp_ult_i64([-1 0], [-1 1]) == [0 -1]
    ; run: %simd_icmp_ult_i64([-5 1], [-1 -1]) == [-1 -1]
    ; run: %simd_icmp_ult_i64([0 0], [0 0]) == [0 0]
    

    Steps to Reproduce

    • clif-util test ./the-above.clif

    Expected Results

    The tests should pass.

    Actual Results

    thread 'worker #1' panicked at 'internal error: entered unreachable code: no rule matched for term x64_pmaxu at src/isa/x64/inst.isle line 2495; should it be partial?', /home/afonso/git/wasmtime/target/debug/build/cranelift-codegen-cafe59ca8460988e/out/isle_x64.rs:4755:12
    note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
     ERROR cranelift_filetests::concurrent > FAIL: panicked in worker #1: internal error: entered unreachable code: no rule matched for term x64_pmaxu at src/isa/x64/inst.isle line 2495; should it be partial?
    FAIL ./lmao.clif: panicked in worker #1: internal error: entered unreachable code: no rule matched for term x64_pmaxu at src/isa/x64/inst.isle line 2495; should it be partial?
    1 tests
    Error: 1 failure
    

    Versions and Environment

    Cranelift version or commit: Main Operating system: Linux Architecture: x86_64

    Extra Info

    These tests already exist in our test suite.
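
    The expected lane masks in the run lines above can be cross-checked with a scalar reference model. This is a sketch for illustration, not how Cranelift lowers the instruction; the function names are hypothetical. The second function shows the sign-bit flip that is commonly used when only signed comparisons are available, which is offered here as an assumption about one possible approach rather than as the fix.

```rust
/// Lane-wise unsigned `>=` on two i64 lanes: all-ones (-1) for true, 0 for false.
pub fn icmp_uge_i64x2(a: [i64; 2], b: [i64; 2]) -> [i64; 2] {
    let mut out = [0i64; 2];
    for i in 0..2 {
        // Reinterpret the bits as unsigned before comparing.
        out[i] = if (a[i] as u64) >= (b[i] as u64) { -1 } else { 0 };
    }
    out
}

/// Same result using only a signed comparison: flipping the sign bit of both
/// operands maps unsigned order onto signed order.
pub fn icmp_uge_via_signed(a: [i64; 2], b: [i64; 2]) -> [i64; 2] {
    let flip = |x: i64| x ^ i64::MIN;
    let mut out = [0i64; 2];
    for i in 0..2 {
        out[i] = if flip(a[i]) >= flip(b[i]) { -1 } else { 0 };
    }
    out
}
```

    Both functions agree with the `uge` run lines, for example `[-5 1]` against `[-1 -1]` yields `[0 0]` because -1 is the largest value when the lanes are read as unsigned.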

    bug cranelift 
    opened by afonso360 0
Releases(dev)
Owner
Bytecode Alliance
A (very experimental) WebAssembly backend for Cranelift.

cranelift_codegen_wasm Experimental code generation for WebAssembly from Cranelift IR. note: not ready for usage yet Setup Contains an item called Was

Teymour Aldridge 8 Dec 5, 2022
A standalone Forth interpreter/compiler for WebAssembly.

ForSM A standalone Forth interpreter/compiler for WebAssembly. Bootstrapped from a Rust program, but the ultimate goal for it is to be self-hosting. A

Simon Gellis 5 Jun 15, 2022
🚀 Wasmer is a fast and secure WebAssembly runtime that enables super lightweight containers to run anywhere

Wasmer is a fast and secure WebAssembly runtime that enables super lightweight containers to run anywhere: from Desktop to the Cloud, Edge and IoT devices.

Wasmer 14.1k Jan 8, 2023
WebAssembly to Lua translator, with runtime

This is a WIP (read: absolutely not ready for serious work) tool for translating WebAssembly into Lua. Support is specifically for LuaJIT, with the se

null 43 Dec 31, 2022
Lunatic is an Erlang-inspired runtime for WebAssembly

Lunatic is a universal runtime for fast, robust and scalable server-side applications. It's inspired by Erlang and can be used from any language that

Lunatic 3.7k Jan 9, 2023
A prototype WebAssembly linker using module linking.

WebAssembly Module Linker Please note: this is an experimental project. wasmlink is a prototype WebAssembly module linker that can link together a mod

Peter Huene 19 Oct 28, 2022
Zaplib is an open-source library for speeding up web applications using Rust and WebAssembly.

⚡ Zaplib Zaplib is an open-source library for speeding up web applications using Rust and WebAssembly. It lets you write high-performance code in Rust

Zaplib 1.2k Jan 5, 2023
A template for kick starting a Rust and WebAssembly project using wasm-pack.

A template for kick starting a Rust and WebAssembly project using wasm-pack.

Haoxi Tan 1 Feb 14, 2022
Client for integrating private analytics in fast and reliable libraries and apps using Rust and WebAssembly

TelemetryDeck Client Client for integrating private analytics in fast and reliable libraries and apps using Rust and WebAssembly The library provides

Konstantin 2 Apr 20, 2022
Lumen - A new compiler and runtime for BEAM languages

An alternative BEAM implementation, designed for WebAssembly

Lumen 3.1k Dec 26, 2022
A high-performance, secure, extensible, and OCI-compliant JavaScript runtime for WasmEdge.

Run JavaScript in WebAssembly Now supporting wasmedge socket for HTTP requests and Tensorflow in JavaScript programs! Prerequisites Install Rust and w

Second State 219 Jan 3, 2023
Wasm runtime written in Rust

Wasm runtime written in Rust

Teppei Fukuda 1 Oct 29, 2021
Sealed boxes implementation for Rust/WebAssembly.

Sealed boxes for Rust/WebAssembly This Rust crate provides libsodium sealed boxes for WebAssembly. Usage: // Recipient: create a new key pair let reci

Frank Denis 16 Aug 28, 2022
WebAssembly on Rust is a bright future in making application runs at the Edge or on the Serverless technologies.

WebAssembly Tour WebAssembly on Rust is a bright future in making application runs at the Edge or on the Serverless technologies. We spend a lot of ti

Thang Chung 129 Dec 28, 2022
WebAssembly modules that use Azure services

This is an experimental repository containing WebAssembly modules running on top of WAGI (WebAssembly Gateway Interface, which allows you to run WebAssembly WASI binaries as HTTP handlers) and using Azure services.

null 7 Apr 18, 2022
WebAssembly Service Porter

WebAssembly Service Porter.

henrylee2cn 12 Dec 12, 2022
WAGI: WebAssembly Gateway Interface

Write HTTP handlers in WebAssembly with a minimal amount of work

null 724 Jan 6, 2023
A console and web-based Gomoku written in Rust and WebAssembly

rust-gomoku A console and web-based Gomoku written in Rust and WebAssembly Getting started with cargo & npm Install required program, run # install

namkyu1999 2 Jan 4, 2022
WebAssembly development with Trunk & Vite.js

Trunk & Vite.js Demo Trunk is a WASM web application bundler for Rust, and Vite.js is next Generation Frontend Tooling. Ok, they are together now for

Libing Chen 6 Nov 24, 2021