Simple WIP GPGPU framework for Rust built on top of wgpu

Jerónimo Sánchez

Last update: Dec 26, 2022

Overview

gpgpu

A simple GPU compute library based on wgpu. It is meant to be used alongside wgpu if desired.

To start using gpgpu, just create a Framework instance and follow the examples in the main repository.

Example

A small program that multiplies two vectors, A and B, element-wise and stores the result in another vector, C.

Rust program

use gpgpu::*;

fn main() -> GpuResult<()> {
    // Framework initialization
    let fw = Framework::default();

    // Original CPU data
    let cpu_data = (0..10000).collect::<Vec<u32>>();

    // GPU buffer creation
    let buf_a = GpuBuffer::from_slice(&fw, &cpu_data);       // Input
    let buf_b = GpuBuffer::from_slice(&fw, &cpu_data);       // Input
    let buf_c = GpuBuffer::<u32>::new(&fw, cpu_data.len());  // Output

    // Shader load from SPIR-V binary file
    let shader = Shader::from_spirv_file(&fw, "<SPIR-V shader path>")?;
    //  or from a WGSL source file
    let shader = Shader::from_wgsl_file(&fw, "<WGSL shader path>")?;    

    // Descriptor set and program creation
    let desc = DescriptorSet::default()
        .bind_buffer(&buf_a, GpuBufferUsage::ReadOnly)
        .bind_buffer(&buf_b, GpuBufferUsage::ReadOnly)
        .bind_buffer(&buf_c, GpuBufferUsage::ReadWrite);
    let program = Program::new(&shader, "main").add_descriptor_set(desc); // Entry point

    // Kernel creation and enqueuing
    Kernel::new(&fw, program).enqueue(cpu_data.len() as u32, 1, 1); // Enqueuing, not very optimal 😅

    let output = buf_c.read()?;                        // Read back C from GPU
    for (a, b) in cpu_data.into_iter().zip(output) {
        assert_eq!(a.pow(2), b);
    }

    Ok(())
}

Shader program

The shader is written in WGSL.

// Vector type definition. Used for both input and output
[[block]]
struct Vector {
    data: [[stride(4)]] array<u32>;
};

// A, B and C vectors
[[group(0), binding(0)]] var<storage, read>  a: Vector;
[[group(0), binding(1)]] var<storage, read>  b: Vector;
[[group(0), binding(2)]] var<storage, read_write> c: Vector;

[[stage(compute), workgroup_size(1)]]
fn main([[builtin(global_invocation_id)]] global_id: vec3<u32>) {
    c.data[global_id.x] = a.data[global_id.x] * b.data[global_id.x];
}
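To check the GPU output, it can help to have a CPU reference for what the shader computes. The sketch below is not part of gpgpu; `multiply_on_cpu` is a hypothetical helper that uses only the standard library. It uses `wrapping_mul` because `u32` arithmetic in WGSL wraps on overflow, whereas plain `*` in Rust panics in debug builds.

```rust
/// CPU reference for the shader above: c[i] = a[i] * b[i].
/// Hypothetical helper, not part of the gpgpu API.
fn multiply_on_cpu(a: &[u32], b: &[u32]) -> Vec<u32> {
    assert_eq!(a.len(), b.len(), "input vectors must have the same length");
    a.iter()
        .zip(b)
        // Wrap on overflow, matching u32 multiplication in WGSL.
        .map(|(x, y)| x.wrapping_mul(*y))
        .collect()
}
```

The vector read back from `buf_c` could then be compared against `multiply_on_cpu(&cpu_data, &cpu_data)` instead of squaring each element by hand.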


Comments

Read buffer into user-provided `Vec`

Is there a way to read data from GPU buffer into user-provided buffer? Looks like currently read() function returns Vec, thus allocating new memory every time. Same for Image buffers

opened by AdrianEddy 3
Add `Shader::from_wgsl_string`
Summary

Add Shader::from_wgsl_string, which is similar to Shader::from_spirv_bytes.

For my own purpose, I needed the functionality. Do you mind adding it to this repo?
opened by maekawatoshiki 1
Simplified `Kernel` creation and reuse
Added Shader struct (wrapper over wgpu::ShaderModule).

Added Program struct (Shader + entry point + bindings). It represents a function in a Shader.

Kernel is created from a Program. It is the execution of it.

DescriptorSet derives Clone.

Updated all examples to new setup.

Removed KernelBuilder: it added complexity.

Removed utils::shaders module. Functionality is now on Shader.

TODO

webcam example does not work. Device lost 🚩

Update and check documentation. Update README as well.
opened by UpsettingBoy 1
Addition of parallel compute example
Check if the size must always be a multiple of the shader's workgroup size for each dimension or if it can be executed regardless.

Considering opening an issue in wgpu.
opened by UpsettingBoy 0
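
Whatever the answer to that question turns out to be, a common way to handle sizes that are not a multiple of the workgroup size is to round the workgroup count up and let the shader ignore the surplus invocations with a bounds check. The sketch below only covers the arithmetic on the Rust side. `workgroup_count` and `surplus_invocations` are hypothetical helpers, and the sketch assumes that the arguments of `enqueue` are workgroup counts, as the README example with `workgroup_size(1)` suggests.

```rust
/// Number of workgroups needed to cover `len` elements when each
/// workgroup handles `workgroup_size` elements (ceiling division).
/// Hypothetical helper, not part of the gpgpu API.
fn workgroup_count(len: u32, workgroup_size: u32) -> u32 {
    assert!(workgroup_size > 0, "workgroup size must be non-zero");
    // Written this way to avoid overflow in `len + workgroup_size - 1`.
    len / workgroup_size + u32::from(len % workgroup_size != 0)
}

/// Invocations launched beyond `len`; the shader has to skip these,
/// for example by returning early when the index is out of range.
fn surplus_invocations(len: u32, workgroup_size: u32) -> u32 {
    let launched = u64::from(workgroup_count(len, workgroup_size)) * u64::from(workgroup_size);
    (launched - u64::from(len)) as u32
}
```

For the 10000-element example and a workgroup size of 64, this gives 157 workgroups and 48 surplus invocations.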

Releases(v0.2.0)

v0.2.0(Dec 23, 2021)
New gpgpu release! This revision includes, among other things, improvements of the API and the new integration with the ndarray crate.

✨ New features

🚩 integrate-ndarray feature for integration with ndarray crate

Integration with ndarray has been added in #3. I'm no expert in using this crate, so any comments are welcome!

Upload ndarray arrays to the GPU using GpuArray objects. They store the dimensions the array had before it was uploaded to the GPU.

Take a look at the ndarray example. It has some comments about its usage and problems.

⚡ Major changes

👩‍💻 API changes

These changes focus on improving the usability of gpgpu and making its asynchronous intent clearer. The most important changes are:

Background polling: Polling functions cannot be called through gpgpu (they can still be called through wgpu). Polling is invoked in the background, every 10 ms by default.

Read functions: All read functions of GpuBuffer and friends have both async and blocking variants:

read() and read_blocking(): Now read into a user-provided buffer (#8).

read_vec() and read_vec_blocking(): Read into a vector that is not allocated by the user (the ol' way).

Write functions: All writes are offloaded immediately, meaning that writes (updates to the GPU) are non-blocking, but their progress cannot be checked.

Clearer interaction with wgpu: gpgpu GPU objects can now be created from wgpu ones using from_gpu_parts() and converted back using into_gpu_parts() (#5).

Kernel creation is simplified. A Kernel now has to be created from a Program, which contains information about the shader, entry point and bindings (#4).

Error handling has been updated. Every GPU object has its own kind of error (#6).
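
The difference between the two read families listed above can be illustrated without a GPU. The sketch below is a CPU-only mock, not the gpgpu API: `MockBuffer`, `read_into` and `read_vec` are hypothetical names, and the real signatures (including their async variants and error types) may differ. It only shows why reading into a caller-owned buffer avoids a new allocation on every read.

```rust
/// Hypothetical stand-in for a GPU buffer; it only holds CPU memory.
struct MockBuffer {
    data: Vec<u32>,
}

impl MockBuffer {
    /// Copies the contents into a caller-owned slice, so the same
    /// destination can be reused across reads. Returns the number
    /// of elements copied.
    fn read_into(&self, out: &mut [u32]) -> Result<usize, String> {
        let n = self.data.len();
        if out.len() < n {
            return Err(format!("destination holds {} elements, need {}", out.len(), n));
        }
        out[..n].copy_from_slice(&self.data);
        Ok(n)
    }

    /// Allocates and returns a fresh vector on every call.
    fn read_vec(&self) -> Vec<u32> {
        self.data.clone()
    }
}
```

In a loop that reads results back every frame, the first style lets the destination be allocated once up front, which is the concern raised in the "Read buffer into user-provided `Vec`" issue.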

🛠 General changes

Added CI that checks code format, build status on some platforms and documentation errors. Right now it does not run any example or test (#7).

The docs.rs documentation now pulls examples into the usage sections of some functions.

Selection of backend via environment variables is done using Framework::default().

Updated wgpu to 0.12.

Full Changelog: https://github.com/UpsettingBoy/gpgpu-rs/compare/v0.1.0...v0.2.0
v0.1.0(Nov 4, 2021)
gpgpu is a GPGPU compute framework built on top of wgpu compute pipeline. It tries to be simple to use whilst being easy to integrate with wgpu.

✨ Features

🔖 GPU Compute made simple

gpgpu exposes some GPU objects with clearly defined methods for reading data from and writing data to the GPU. Some of these objects are:

GpuBuffer: Homogeneous read-write (in shaders) buffer on the GPU.

GpuUniformBuffer: Homogeneous read-only (in shaders) buffer on the GPU. Perfect for small, readonly data.

GpuConstImage and GpuImage: 2D homogeneous image on the GPU. The former is read-only, while the latter is write-only (in shaders).

gpgpu also exposes DescriptorSet and Kernel objects to make it easier to manage and reuse binding groups and shader executions.

🚩 integrate-image feature for integration with image crate

Compile-time type conversion safety from/to image::ImageBuffer to/from gpgpu::GpuImage and gpgpu::GpuConstImage objects

Supported pixel types:

RGBA8 Uint

RGBA8 Uint Norm

RGBA8 Sint

RGBA8 Sint Norm

Bidirectional conversions between image::ImageBuffer and gpgpu::GpuImage or gpgpu::GpuConstImage

Sync and async read

Sync and async writes

⛰ Environment backend selection

Using the env variable WGPU_BACKEND, the gpgpu-rs backend can be selected. Available options are:

Vulkan: "vulkan" or "vk"

DX12: "dx12" or "d3d12"

DX11: "dx11" or "d3d11"

Metal: "metal" or "mtl"

OpenGL ES: "opengl", "gles" or "gl"

WebGPU: "webgpu"

Any comma-separated combination of the previous options is valid, e.g. WGPU_BACKEND='vk,gl'.
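
As an illustration of the format described above, here is a minimal sketch of how such a value could be parsed. It is not how gpgpu or wgpu implement it: the `Backend` enum and both functions are hypothetical, and treating names as case-insensitive and silently skipping unknown names are assumptions made for the example.

```rust
/// Hypothetical backend identifiers, one per option listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Backend {
    Vulkan,
    Dx12,
    Dx11,
    Metal,
    Gl,
    WebGpu,
}

/// Parses a comma-separated list such as "vk,gl".
/// Assumption: names are case-insensitive and unknown names are skipped.
fn parse_backends(value: &str) -> Vec<Backend> {
    value
        .split(',')
        .filter_map(|name| match name.trim().to_lowercase().as_str() {
            "vulkan" | "vk" => Some(Backend::Vulkan),
            "dx12" | "d3d12" => Some(Backend::Dx12),
            "dx11" | "d3d11" => Some(Backend::Dx11),
            "metal" | "mtl" => Some(Backend::Metal),
            "opengl" | "gles" | "gl" => Some(Backend::Gl),
            "webgpu" => Some(Backend::WebGpu),
            _ => None,
        })
        .collect()
}

/// Reads WGPU_BACKEND from the environment; an unset variable yields an empty list.
fn backends_from_env() -> Vec<Backend> {
    std::env::var("WGPU_BACKEND")
        .map(|value| parse_backends(&value))
        .unwrap_or_default()
}
```

With `WGPU_BACKEND='vk,gl'`, `backends_from_env()` would return the Vulkan and OpenGL ES entries in that order.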

🆘 Help wanted

For an efficient implementation of the conversion between image and gpgpu types, some unsafe code was required. A review of this fragment is very much appreciated 😄 (in src/features/integrate_image.rs 👀).

I'm running out of ideas. The initial goal of this project was to allow me to develop a GPU-based image processing library (imageproc_gpu). Since I think gpgpu is enough for that (but still very lacking), I'll continue with that project, adding whatever I need for GPGPU computing back to gpgpu. Any other ideas are appreciated.

Full Changelog: https://github.com/UpsettingBoy/gpgpu-rs/compare/v0.0.0-alpha.2...v0.1.0
v0.0.0-alpha.2(Oct 20, 2021)
This new release includes the capability to convert image::ImageBuffer from the image crate to gpgpu::GpuImage objects and vice versa.

✨ New features

🚩 integration-image feature flag

Compile-time type conversion safety from/to image::ImageBuffer to/from gpgpu::GpuImage objects

Supported pixel types:

RGBA8 Uint

RGBA8 Uint Norm

RGBA8 Sint

RGBA8 Sint Norm

Bidirectional conversions between image::ImageBuffer and gpgpu::GpuImage

Sync and async read

Sync and async writes

⛰ Environment backend selection

Using the env variable WGPU_BACKEND, the gpgpu-rs backend can be selected. Available options are:

Vulkan: "vulkan" or "vk"

DX12: "dx12" or "d3d12"

DX11: "dx11" or "d3d11"

Metal: "metal" or "mtl"

OpenGL ES: "opengl", "gles" or "gl"

WebGPU: "webgpu"

Any comma-separated combination of the previous options is valid, e.g. WGPU_BACKEND='vk,gl'.

📕 Vagrant box

For the development of gpgpu-rs a headless Vagrant box was created. It runs under Ubuntu 21.04. Only the OpenGL ES backend can be used.

For a complete list of the changes since the last release: https://github.com/UpsettingBoy/gpgpu-rs/compare/v0.0.0-alpha.1...v0.0.0-alpha.2

🕹 Usage

Since this version is not yet published on crates.io, add it to your Cargo.toml as follows:

[dependencies]
gpgpu = { git = "https://github.com/UpsettingBoy/gpgpu-rs", tag = "v0.0.0-alpha.2" }

🆘 Help wanted

For an efficient implementation of the conversion between image and gpgpu-rs types, some unsafe code was required. A review of this fragment is very much appreciated 😄 (in src/features/integrate_image.rs 👀).

I'm running out of ideas. The initial goal of this project was to allow me to develop a GPU-based image processing library (imageproc_gpu). Since I think gpgpu-rs is enough for that (but still very lacking), I'll continue with that project, adding whatever I need for GPGPU computing back to gpgpu-rs. Any other ideas are appreciated.

v0.0.0-alpha.1(Sep 27, 2021)
This is the first "real" release of gpgpu-rs, not an MVP yet but still usable.

Features

Offloading of computations into the GPU 📈

Using gpgpu-rs primitives + compute shaders

Supports storage buffers, textures and storage textures in shaders

Supports SPIR-V compute shader loading (works best with WGSL shaders from both source and SPIR-V)

Fairly primitive wgpu integration. Allows usage of gpgpu-rs primitives with wgpu pipelines (mainly graphic pipelines)

2 simple examples, one for buffer compute and another one for image manipulation

GpuBuffer primitive

Buffer of n T elements on the GPU. T must be bytemuck::Pod

Supports sync and async read and write operations

GpuImage primitive

Handles a texture on the GPU. Only RGBA8 images yet 😢

Supports sync and async read and write operations

TODO

Improve documentation and examples (new and more complex examples)

Support of different texture formats on GpuImage

Reuse of staging buffers (optional, typed as in the Vec allocator)

New primitives for fine grained GPU usage (CpuAccessibleBuffer, ReadOnlyGpuImage, etc; better naming 😆)

More to come...

All in all, I'm pretty happy with this release. I'm looking for comments on the API, if it is simple, easy to use, etc.

Owner

Jerónimo Sánchez

CS graduate with focus on HPC systems and software. Nowadays working with Rust, C, C++ and C#.

