descent

Toy library for neural networks in Rust using Vulkan compute shaders.

Features

  • Multi-dimensional arrays backed by Vulkan device memory
  • Use Rust syntax to build a computation graph that runs as Vulkan compute shaders
    • Supports vector arithmetic and per-element sin/cos/exp/log/etc
    • Supports 1D reductions, 2D matrix multiply, 2D convolution and 2D max pool
    • Gather loads and scatter adds
    • Softmax cross entropy loss
    • Ops are fused into larger compute shaders where possible (to reduce bandwidth cost)
    • Implements broadcasts/padding/windowing/reshapes as views (zero copy) where possible (see the sketch after this list)
  • Supports one level of automatic derivatives for back-propagation
  • Some example optimisers:
    • Stochastic gradient descent (with momentum)
    • Adam
  • Optional higher-level API of neural network building blocks
    • Can generate different code for train vs test (e.g. dropout only affects training)
  • Deterministic results (except for scatter add, which currently uses float atomics...)
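
To make the zero-copy views idea concrete, here is a minimal stride-based view in plain Rust. This is an illustrative sketch only, not descent's actual View type: the trick is that broadcasting a size-1 axis just sets that axis's stride to zero, so the same element is read for every index along it and no data is copied.

// Illustrative only: a minimal stride-based view, not descent's View type.
#[derive(Clone, Debug)]
struct View {
    shape: Vec<usize>,
    strides: Vec<usize>, // in elements; stride 0 repeats data along that axis
}

impl View {
    fn contiguous(shape: &[usize]) -> Self {
        let mut strides = vec![0; shape.len()];
        let mut step = 1;
        for i in (0..shape.len()).rev() {
            strides[i] = step;
            step *= shape[i];
        }
        Self { shape: shape.to_vec(), strides }
    }

    // Broadcast a size-1 axis to length n by giving it stride 0.
    fn broadcast_axis(mut self, axis: usize, n: usize) -> Self {
        assert_eq!(self.shape[axis], 1);
        self.shape[axis] = n;
        self.strides[axis] = 0;
        self
    }

    // Map an n-dimensional index to an offset into the underlying storage.
    fn offset(&self, index: &[usize]) -> usize {
        index.iter().zip(&self.strides).map(|(i, s)| i * s).sum()
    }
}

fn main() {
    // A [3, 1] column vector broadcast to [3, 4] without copying any data.
    let data = [1.0f32, 2.0, 3.0];
    let view = View::contiguous(&[3, 1]).broadcast_axis(1, 4);
    assert_eq!(view.shape, vec![3, 4]);
    assert_eq!(view.offset(&[2, 3]), 2); // row 2, any column -> same element
    println!("{}", data[view.offset(&[2, 3])]); // prints 3
}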

Example Network

The higher-level API of neural network building blocks can be used to compactly describe multi-layer networks. Here is a small convolutional neural network with dropout and leaky ReLU activations built with this API:

struct ConvNet {
    conv1: Conv2D,
    conv2: Conv2D,
    fc1: Dense,
    fc2: Dense,
}

impl ConvNet {
    fn new(env: &mut Environment) -> Self {
        // create and store parameters for layers that require them
        let c1 = 16;
        let c2 = 32;
        let hidden = 128;
        Self {
            conv1: Conv2D::builder(1, c1, 3, 3).with_pad(1).build(env),
            conv2: Conv2D::builder(c1, c2, 3, 3)
                .with_pad(1)
                .with_groups(2)
                .build(env),
            fc1: Dense::builder(7 * 7 * c2, hidden).build(env),
            fc2: Dense::builder(hidden, 10).build(env),
        }
    }
}

impl Module for ConvNet {
    fn eval<'s>(&self, input: DualArray<'s>, ctx: &EvalContext) -> DualArray<'s> {
        // generates ops for the value (forwards) and gradient (backwards) through the layers
        input
            .apply(&self.conv1, ctx)
            .leaky_relu(0.01)
            .max_pool2d((2, 2), (2, 2))
            .apply(&self.conv2, ctx)
            .leaky_relu(0.01)
            .max_pool2d((2, 2), (2, 2))
            .flatten()
            .apply(&Dropout::new(0.5), ctx)
            .apply(&self.fc1, ctx)
            .leaky_relu(0.01)
            .apply(&self.fc2, ctx)
    }
}
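
For intuition, here is how shapes would flow through this network, assuming a batch of 28×28 single-channel images (as in Fashion-MNIST; the 7 * 7 * c2 input size of fc1 implies a 28×28 input) and a [batch, height, width, channels] layout:

// Assumed shape walk-through for a batch of n 28x28 grayscale images:
// input:               [n, 28, 28, 1]
// conv1 (3x3, pad 1):  [n, 28, 28, 16]
// max_pool2d (2, 2):   [n, 14, 14, 16]
// conv2 (3x3, pad 1):  [n, 14, 14, 32]
// max_pool2d (2, 2):   [n, 7, 7, 32]
// flatten:             [n, 1568]        // 7 * 7 * 32, matching fc1's input
// fc1:                 [n, 128]
// fc2:                 [n, 10]          // one logit per class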

See the fashion_mnist example for more networks using this API.
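
One note on reading these examples: apply appears to be shorthand for evaluating a module on an array. A hypothetical definition consistent with the Module trait above (descent's actual method may be defined differently):

// Hypothetical sketch only: how apply could be expressed via the Module
// trait shown above. The real method in descent may differ.
impl<'s> DualArray<'s> {
    fn apply<M: Module>(self, module: &M, ctx: &EvalContext) -> DualArray<'s> {
        module.eval(self, ctx)
    }
}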

Examples

Follow the link in each example's name for a more detailed description.

  • array_api: Demonstrates the low-level Array API for building computation graphs. See the README for more details.
  • fashion_mnist: Trains a few different network types on the Fashion-MNIST dataset. Demonstrates the use of anti-aliasing during max pooling for improved accuracy. See the README for a comparison of network performance.
  • image_fit: Overfits a few different network types to a single RGB image. Compares ReLU with positional encoding to a SIREN network. Update: now also compares to a multi-level hash encoding.

Dependencies

The following crates have been very useful in developing this project:

  • petgraph: used for all graph data structures
  • slotmap: storage with stable keys
  • shaderc: interface to GLSL compiler to generate SPIR-V for shaders
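
For reference, compiling a GLSL compute shader to SPIR-V through the shaderc crate looks roughly like this (a sketch of the crate's public API, not of descent's internal code generation):

// Minimal sketch: compile GLSL compute shader source to SPIR-V words
// using the shaderc crate.
fn compile_compute_shader(glsl: &str) -> Vec<u32> {
    let compiler = shaderc::Compiler::new().unwrap();
    let artifact = compiler
        .compile_into_spirv(glsl, shaderc::ShaderKind::Compute, "shader.comp", "main", None)
        .unwrap();
    artifact.as_binary().to_vec() // SPIR-V, ready to hand to Vulkan
}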

Potential Future Work

  • Lookahead optimiser?
  • Recurrent network
  • SDF fitting
  • Multi-level hash encoding

Comments
  • Basic question

    I am new to Rust and am using this project to learn more.

    Basic questions:

    • Do you take the computational graph and then create one kernel on the fly?
    • How do you create a simple 2D (or 3D) array with some data?

    question · opened by ib00 · 2 comments
  • Trying to create a UNet with descent, but a node is not connected

    Hi, I've been trying to extend descent to be able to create a U-Net (as in https://arxiv.org/abs/1505.04597) with it.

    A U-Net is basically a bunch of convolutional layers that get concatenated with previous versions of their input.

    I implemented the following two new operations in array.rs:

    Upsampling: https://github.com/apexys/descent/blob/61415476ffaeda734b841e11b533092691c569b4/src/array.rs#L833-L848

    pub fn upsample(self, x_grow_factor: usize, y_grow_factor: usize) -> Self {
        let (a, da) = self.into_inner();
        let input_shape = a.shape();
        assert_eq!(input_shape.len(), 4);
        assert_eq!(a.shape(), da.shape());
        // Insert a size-1 axis after each spatial axis, broadcast it up to the
        // grow factor, then fold it back in: nearest-neighbour upsampling.
        let a_reshaped = a.reshape([input_shape[0], input_shape[1], 1, input_shape[2], 1, input_shape[3]]);
        let da_reshaped = da.reshape([input_shape[0], input_shape[1], 1, input_shape[2], 1, input_shape[3]]);
        let a_broadcasted = a_reshaped.broadcast([input_shape[0], input_shape[1], y_grow_factor, input_shape[2], x_grow_factor, input_shape[3]]);
        let da_broadcasted = da_reshaped.broadcast([input_shape[0], input_shape[1], y_grow_factor, input_shape[2], x_grow_factor, input_shape[3]]);
        let mut output_shape = input_shape;
        output_shape[1] *= y_grow_factor; // axis 1 is height
        output_shape[2] *= x_grow_factor; // axis 2 is width
        let a_backshaped = a_broadcasted.reshape(output_shape);
        let da_backshaped = da_broadcasted.reshape(output_shape);
        (a_backshaped, da_backshaped).into()
    }
    

    Cropping: https://github.com/apexys/descent/blob/61415476ffaeda734b841e11b533092691c569b4/src/array.rs#L850-L874

    pub fn crop(self, left: usize, top: usize, right: usize, bottom: usize) -> Self {
        let (a, da) = self.into_inner();

        let input_shape = a.shape();
        assert_eq!(input_shape.len(), 4);

        // Shift the view origin by (top, left)...
        let mut input_offsets: TinyVec<[isize; MAX_DIM]> = std::iter::repeat(0).take(input_shape.len()).collect();
        input_offsets[1] = top as isize;
        input_offsets[2] = left as isize;

        // ...and shrink the spatial axes by the total crop amounts.
        let mut output_shape = input_shape;
        output_shape[1] -= top + bottom;
        output_shape[2] -= left + right;

        let view = View {
            input_shape: a.shape(),
            input_offsets,
            output_mapping: (0..input_shape.len()).map(|i| input_shape.identity_mapping(Axis::from_index(i))).collect(),
            output_shape
        };

        let b = a.view(view);
        let db = da.view(view);
        (b, db).into()
    }
    

    The U-Net is then built as a recursive struct. Each layer applies two convolutions; if it is not the innermost layer, it also pools its output, evaluates the inner layer on it, upsamples and crops the result back to the outer size, concatenates it onto its own output, and runs the combined tensor through another pair of convolutions.

    Here's just the execution part, full code is at https://github.com/apexys/descent_unet_example/blob/b828ce8c4a034d9288f316c108a72229b163f493/src/main.rs

    fn eval<'s>(&self, input: DualArray<'s>, ctx: &EvalContext) -> DualArray<'s> {
        let x = input.apply(&self.conv1, ctx);
        let x = x.apply(&self.conv2, ctx);
        let x = if let Some((pool, inner)) = self.inner.as_ref() {
            let [_, h_outer, w_outer, _]: [usize; 4] = x.shape().try_into().unwrap();

            // Recurse: pool down, run the inner layer, then bring its output
            // back up to (at least) the outer resolution.
            let x_inner = x.apply(pool, ctx);
            let x_inner = inner.eval(x_inner, ctx);
            let [_, h_inner, w_inner, _]: [usize; 4] = x_inner.shape().try_into().unwrap();
            let x_inner = x_inner.upsample(
                w_outer.div_ceil(w_inner),
                h_outer.div_ceil(h_inner)
            );
            // Crop any excess from the ceiling division so the shapes match,
            // then concatenate along the channel axis.
            let [_, h_inner, w_inner, _]: [usize; 4] = x_inner.shape().try_into().unwrap();
            assert_eq!(h_inner, w_inner);
            let left = (w_inner - w_outer) / 2;
            let right = (w_inner - w_outer) - left;
            let top = (h_inner - h_outer) / 2;
            let bottom = (h_inner - h_outer) - top;
            let x_inner = x_inner.crop(left, top, right, bottom);
            x.concat(x_inner, -1)
        } else {
            x
        };
        let x = x.apply(&self.conv3, ctx);
        let x = x.apply(&self.conv4, ctx);
        x
    }
    

    When I create the graph, I run into the following problem:

    Somewhere in the creation of the graph, a Mov operation is created, but no inputs are connected to it.

    I've added logging after every stage of the optimization process, but the error seems to be there from the start (see the upper right Node n295 in https://github.com/apexys/descent_unet_example/blob/main/svgs/after_build_clusters.svg).

    I've also tried isolating the problem and it seems that my upsample and crop operations are "fine", at least I couldn't get the error to show up with just them, but they might still cause problems further down the line.

    Do you have an idea of what might cause this?

    Thank you for your help!

    opened by apexys · 6 comments
Owner

Simon Brown