Wonnx is a GPU-accelerated ONNX inference runtime written 100% in Rust, ready for the web. It is based on [wgpu](https://github.com/gfx-rs/wgpu).
## Supported Platforms (enabled by wgpu)

| API    | Windows | Linux & Android | macOS & iOS |
| ------ | ------- | --------------- | ----------- |
| Vulkan | ✅      | ✅              |             |
| Metal  |         |                 | ✅          |
| DX12   | ✅      |                 |             |
| DX11   | ✅      |                 |             |
| GLES3  |         | ✅              |             |
## Getting started

- Install Rust
- Install Vulkan, Metal, or DX12 for the GPU API
- Ensure Git LFS is installed
- `git clone` this repo:

```bash
git clone https://github.com/webonnx/wonnx.git
git lfs install
```
### From the command line

Ensure Git LFS is initialized and has downloaded the model files (in `wonnx/examples/data/models`). Then, you're all set! You can run an example:

```bash
cargo run --example squeeze --release
```

Or you can try the CLI (see the README for more information):

```bash
cargo run --release -- info ./data/models/opt-squeeze.onnx
cargo run --release -- infer ./data/models/opt-squeeze.onnx -i data=./data/images/pelican.jpeg --labels ./data/models/squeeze-labels.txt --top 3
```
### From Python

```bash
pip install wonnx
```

And then:

```python
from wonnx import PySession

session = PySession.from_path("../data/models/single_relu.onnx")
inputs = {"x": [-1.0, 2.0]}
assert session.run(inputs) == {"y": [0.0, 2.0]}
```
To build the Python module for development:

```bash
cd wonnx-py
python3 -m venv .env
source .env/bin/activate
pip install maturin
maturin develop
```

Then run `python3` with the above Python code!
## Running a model from scratch

- To run an ONNX model, first simplify it with onnx-simplifier:

```bash
# pip install -U pip && pip install onnx-simplifier
python -m onnxsim mnist-8.onnx opt-mnist.onnx
```

- Then you can run it following the example in the examples folder:

```bash
cargo run --example mnist --release
```
In Rust, a session is created from the model file and then run; the `squeeze` example looks like this:

```rust
use std::collections::HashMap;
use wonnx::utils::InputTensor;

fn main() {
    // Load the image and feed it in as the "data" input
    let mut input_data = HashMap::new();
    let image = load_squeezenet_image(); // Load image
    input_data.insert("data".to_string(), InputTensor::F32(image.as_slice().unwrap()));

    // Create a session from the ONNX model file and run inference
    let session = pollster::block_on(wonnx::Session::from_path(
        "examples/data/models/opt-squeeze.onnx",
    ))
    .expect("session did not create");
    let result = pollster::block_on(session.run(input_data)).unwrap();

    // Rank class indices by descending score; class 22 should win for this image
    let result = &result["squeezenet0_flatten0_reshape0"];
    let mut probabilities = result.iter().enumerate().collect::<Vec<_>>();
    probabilities.sort_unstable_by(|a, b| b.1.partial_cmp(a.1).unwrap());
    assert_eq!(probabilities[0].0, 22);
}
```
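To turn the winning index into a human-readable class name, you could read the labels file used by the CLI example above. A minimal, hypothetical sketch (the helper name and the assumption of one label per line are ours, not part of the example):

```rust
use std::fs;

// Hypothetical helper: map ranked (index, score) pairs to class names,
// assuming the labels file from the CLI example above, one label per line.
fn print_top_classes(probabilities: &[(usize, &f32)], top: usize) {
    let labels = fs::read_to_string("examples/data/models/squeeze-labels.txt")
        .expect("labels file not found");
    let labels: Vec<&str> = labels.lines().collect();
    for (index, score) in probabilities.iter().take(top) {
        println!("{}: {}", labels[*index], score);
    }
}
```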
Examples are available in the `examples` folder.
## Tested models

- Squeezenet
- MNIST
## GPU selection

You may set the following environment variables to influence GPU selection by WGPU:

- `WGPU_ADAPTER_NAME` with a substring of the name of the adapter you want to use (e.g. `1080` will match `NVIDIA GeForce 1080ti`).
- `WGPU_BACKEND` with a comma-separated list of the backends you want to use (`vulkan`, `metal`, `dx12`, `dx11`, or `gl`).
- `WGPU_POWER_PREFERENCE` with the power preference to choose when a specific adapter name isn't specified (`high` or `low`).
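For example, to force the Vulkan backend and prefer the more powerful GPU when running one of the examples:

```bash
WGPU_BACKEND=vulkan WGPU_POWER_PREFERENCE=high cargo run --example squeeze --release
```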
## Contribution: On implementing a new Operator

Contributions are very much welcome, even without prior experience in DL, WGSL, or Rust. I hope this project can be a sandbox for all of us to learn more about those technologies beyond its initial scope.
To implement an operator, all you have to do is:

- Add a new matching pattern in `compiler.rs`.
- Retrieve its attribute values using the `get_attribute` function:

```rust
let alpha = get_attribute("alpha", Some(1.0), node);
// or without a default value
let alpha = get_attribute::<f32>("alpha", None, node);
```
- Add any variable you want to use in the WGSL shader using `context`.
- Write a new WGSL template in the `templates` folder (see the template sketch at the end of this section). Available types are in `structs.wgsl`, but you can also generate new ones within your templates.
- Respect the binding layout: bindings are numbered consecutively starting from 0, with inputs first and the output last. If the number of bindings goes above 4, increment the binding group. You can change the inputs within `sequencer.rs`.
- Write the logic.
There are default variables available in the template context:

- `{{ i_lens[0] }}`: the length of input 0. This also works for outputs (`{{ o_lens[0] }}`) and other inputs (`{{ i_lens[1] }}`).
- `{{ i_shape[0] }}`: the array of dimensions of input 0. To get the first dimension of the array, just use `{{ i_shape[0][0] }}`.
- `{{ i_chunks[0] }}`: the chunk sizes of each dimension of input 0. By default, each variable is represented as a long array of values, and to get to a specific value you have to move by chunks. Those chunk sizes are stored in this variable. To get the chunk size of the first dimension, use `{{ i_chunks[0][0] }}`.
- `{{ op_type }}`: the op type, as some op types (such as activations) share the same template.
- Test it using the utils functions and place it in the `tests` folder. A test can look like this:

```rust
use std::collections::HashMap;
use wonnx::utils::{graph, model, node, tensor};

#[test]
fn test_matmul_square_matrix() {
    // USER INPUT
    let n = 16;
    let mut input_data = HashMap::new();

    let data_a = ndarray::Array2::eye(n);
    let mut data_b = ndarray::Array2::<f32>::zeros((n, n));
    data_b[[0, 0]] = 0.2;
    data_b[[0, 1]] = 0.5;

    // Compute the expected result on the CPU
    let sum = data_a.dot(&data_b);

    input_data.insert("A".to_string(), data_a.as_slice().unwrap());
    input_data.insert("B".to_string(), data_b.as_slice().unwrap());

    // Build a single-node model computing C = MatMul(A, B)
    let n = n as i64;
    let model = model(graph(
        vec![tensor("A", &[n, n]), tensor("B", &[n, n])],
        vec![tensor("C", &[n, n])],
        vec![],
        vec![],
        vec![node(vec!["A", "B"], vec!["C"], "MatMul", "MatMul", vec![])],
    ));

    let session =
        pollster::block_on(wonnx::Session::from_model(model)).expect("Session did not create");
    let result = pollster::block_on(session.run(input_data)).unwrap();
    assert_eq!(result["C"].as_slice(), sum.as_slice().unwrap());
}
```
Check out the Tera documentation for other templating operations: https://tera.netlify.app/docs/
- If at any point you want to optimize across several nodes, you can do it within `sequencer.rs`.
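Finally, to make the template steps concrete, here is a minimal, hypothetical WGSL template sketch (not one of the shipped templates) that simply copies input 0 to the output. It assumes an `Array` type from `structs.wgsl` and current WGSL attribute syntax, and uses the `{{ i_lens[0] }}` context variable described above:

```wgsl
{%- include "structs.wgsl" -%}

// Binding layout: inputs first, output last, bindings numbered from 0
@group(0) @binding(0)
var<storage, read> input_0: Array;

@group(0) @binding(1)
var<storage, read_write> output_0: Array;

@compute @workgroup_size(256)
fn main(@builtin(global_invocation_id) global_id: vec3<u32>) {
    let gidx = global_id.x;
    // {{ i_lens[0] }} expands to the length of input 0 when the template is rendered
    if (gidx < {{ i_lens[0] }}u) {
        output_0.data[gidx] = input_0.data[gidx];
    }
}
```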