A repo for learning how to parallelize computations in the GPU using Apple's Metal, in Rust.

Overview

Metal playground in rust

Made for learning how to parallelize computations in the GPU using Apple's Metal, in Rust, via the metal crate.

Overview

The source code will contain documented examples of use, growing in complexity, aimed at a final objective of parallelize a Fast Fourier Transform algorithm.

Contents

  1. dotprod: learn the basics of Metal and metal-rs, implement a simple dot product between two vectors represented as uint arrays.
  2. matrixprod: a more complex example to learn about grid size and thread groups, implement a product between square matrices.
  3. memory: example to show how to shared memory between CPU and GPU. This example simply creates a vector to be modified from the GPU.

Pre-requisites

To run the examples, use the following command:

cargo run --example {example}

To run the memory example some nightly features are needed so +nightly has to be added to the command above.

To re-build all the necessary .metallib files, you can use

make build_metal EXAMPLE={example}

where {example} is the name of the example to run in both commands.

References

You might also like...
A Demo server serving Bert through ONNX with GPU written in Rust with 3

Demo BERT ONNX server written in rust This demo showcase the use of onnxruntime-rs on BERT with a GPU on CUDA 11 served by actix-web and tokenized wit

Wonnx - a GPU-accelerated ONNX inference run-time written 100% in Rust, ready for the web

Wonnx is a GPU-accelerated ONNX inference run-time written 100% in Rust, ready for the web. Supported Platforms (enabled by wgpu) API Windows Linux &

A gpu accelerated (optional) neural network Rust crate.

Intricate A GPU accelerated library that creates/trains/runs neural networks in pure safe Rust code. Architechture overview Intricate has a layout ver

LLaMa 7b with CUDA acceleration implemented in rust. Minimal GPU memory needed!

LLaMa 7b in rust This repo contains the popular LLaMa 7b language model, fully implemented in the rust programming language! Uses dfdx tensors and CUD

A fun, hackable, GPU-accelerated, neural network library in Rust, written by an idiot

Tensorken: A Fun, Hackable, GPU-Accelerated, Neural Network library in Rust, Written by an Idiot (work in progress) Understanding deep learning from t

Signed distance functions + Rust (CPU & GPU) = ❤️❤️
Signed distance functions + Rust (CPU & GPU) = ❤️❤️

sdf-playground Signed distance functions + Rust (CPU & GPU) = ❤️❤️ Platforms: Windows, Mac & Linux. About sdf-playground is a demo showcasing how you

Open Machine Intelligence Framework for Hackers. (GPU/CPU)

Leaf • Introduction Leaf is a open Machine Learning Framework for hackers to build classical, deep or hybrid machine learning applications. It was ins

Damavand is a quantum circuit simulator. It can  run on laptops or High Performance Computing architectures, such CPU distributed architectures or multi GPU distributed architectures.
Damavand is a quantum circuit simulator. It can run on laptops or High Performance Computing architectures, such CPU distributed architectures or multi GPU distributed architectures.

Damavand is a code that simulates quantum circuits. In order to learn more about damavand, refer to the documentation. Development status Core feature

A library implementing GPU-accelerated cryptographic functionality for the zkSync prover.
A library implementing GPU-accelerated cryptographic functionality for the zkSync prover.

zkSync Era: A ZK Rollup For Scaling Ethereum zkSync Era is a layer 2 rollup that uses zero-knowledge proofs to scale Ethereum without compromising on

Comments
  • Make FP class generic enough to accept any `P` value

    Make FP class generic enough to accept any `P` value

    This issue can be divided into two tasks:

    1. Remove M and R2 constants and update operations so that results are still valid. This will be less efficent than the original implementation.
    2. Convert FP class to a template that takes P as an argument: template <uint32_t P> class Fp. If you want examples of a template class, see: https://github.com/andrewmilson/ministark/blob/main/gpu-poly/src/metal/felt_u128.h.metal#L50-L54
    opened by mpaulucci 0
Owner
Lambdaclass
A venture studio and a engineering powerhouse at the same time
Lambdaclass
zenoh-flow aims at providing a zenoh-based data-flow programming framework for computations that span from the cloud to the device.

Eclipse Zenoh-Flow Zenoh-Flow provides a zenoh-based dataflow programming framework for computations that span from the cloud to the device. ⚠️ This s

null 35 Dec 12, 2022
Rust based Cross-GPU Machine Learning

HAL : Hyper Adaptive Learning Rust based Cross-GPU Machine Learning. Why Rust? This project is for those that miss strongly typed compiled languages.

Jason Ramapuram 83 Dec 20, 2022
Open deep learning compiler stack for cpu, gpu and specialized accelerators

Open Deep Learning Compiler Stack Documentation | Contributors | Community | Release Notes Apache TVM is a compiler stack for deep learning systems. I

The Apache Software Foundation 8.9k Jan 4, 2023
A real-time implementation of "Ray Tracing in One Weekend" using nannou and rust-gpu.

Real-time Ray Tracing with nannou & rust-gpu An attempt at a real-time implementation of "Ray Tracing in One Weekend" by Peter Shirley. This was a per

null 89 Dec 23, 2022
Flexible, reusable reinforcement learning (Q learning) implementation in Rust

Rurel Rurel is a flexible, reusable reinforcement learning (Q learning) implementation in Rust. Release documentation In Cargo.toml: rurel = "0.2.0"

Milan Boers 60 Dec 29, 2022
A demo repo that shows how to use the latest component model feature in wasmtime to implement a key-value capability defined in a WIT file.

Key-Value Component Demo This repo serves as an example of how to use the latest wasm runtime wasmtime and its component-model feature to build and ex

Jiaxiao Zhou 3 Dec 20, 2022
Ecosystem of libraries and tools for writing and executing extremely fast GPU code fully in Rust.

Ecosystem of libraries and tools for writing and executing extremely fast GPU code fully in Rust.

Riccardo D'Ambrosio 2.1k Jan 5, 2023
Ecosystem of libraries and tools for writing and executing fast GPU code fully in Rust.

The Rust CUDA Project An ecosystem of libraries and tools for writing and executing extremely fast GPU code fully in Rust Guide | Getting Started | Fe

Rust GPU 2.1k Dec 30, 2022
How to: Run Rust code on your NVIDIA GPU

Status This documentation about an unstable feature is UNMAINTAINED and was written over a year ago. Things may have drastically changed since then; r

null 343 Dec 22, 2022
🐉 Making Rust a first-class language and ecosystem for GPU shaders 🚧

?? rust-gpu Rust as a first-class language and ecosystem for GPU graphics & compute shaders Current Status ?? Note: This project is still heavily in d

Embark 5.5k Jan 9, 2023