# descent
Toy library for neural networks in Rust using Vulkan compute shaders.
## Features
- Multi-dimensional arrays backed by Vulkan device memory
- Build computation graphs using ordinary Rust syntax, then run them as Vulkan compute shaders
- Supports vector arithmetic and per-element sin/cos/exp/log/etc
- Supports 1D reductions, 2D matrix multiplies, 2D convolutions and 2D max pooling
- Gather loads and scatter adds
- Softmax cross entropy loss
- Ops are fused into larger compute shaders where possible (to reduce bandwidth cost)
- Implements broadcasts/padding/windowing/reshapes as views (zero copy) where possible
- Supports one level of automatic derivatives for back-propagation
- Some example optimisers (a plain-Rust sketch of the momentum update follows this list):
  - Stochastic gradient descent (with momentum)
  - Adam
- Optional higher-level API of neural network building blocks
- Can generate different code for train vs test (e.g. dropout only affects training)
- Deterministic results (except for scatter add, which currently uses float atomics...)
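
As a point of reference for the optimisers above, here is a minimal plain-Rust sketch of the stochastic-gradient-descent-with-momentum update rule. It operates on flat `f32` slices purely for illustration; the library's actual optimiser applies the equivalent update to arrays in Vulkan device memory.

```rust
/// One SGD-with-momentum step over flat parameter/gradient/velocity slices.
/// Plain-Rust sketch of the update rule only; the library runs the
/// equivalent update as a Vulkan compute shader over device memory.
fn sgd_momentum_step(
    params: &mut [f32],
    grads: &[f32],
    velocity: &mut [f32],
    learning_rate: f32,
    momentum: f32,
) {
    for ((p, &g), v) in params.iter_mut().zip(grads).zip(velocity.iter_mut()) {
        // v <- momentum * v + g;  p <- p - lr * v
        *v = momentum * *v + g;
        *p -= learning_rate * *v;
    }
}
```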
## Example Network
The higher-level API of neural network building blocks can be used to compactly describe multi-layer networks. Here is a small convolutional neural network with dropout and leaky ReLU activations built using this API:
```rust
struct ConvNet {
    conv1: Conv2D,
    conv2: Conv2D,
    fc1: Dense,
    fc2: Dense,
}

impl ConvNet {
    fn new(env: &mut Environment) -> Self {
        // create and store parameters for layers that require them
        let c1 = 16;
        let c2 = 32;
        let hidden = 128;
        Self {
            conv1: Conv2D::builder(1, c1, 3, 3).with_pad(1).build(env),
            conv2: Conv2D::builder(c1, c2, 3, 3)
                .with_pad(1)
                .with_groups(2)
                .build(env),
            fc1: Dense::builder(7 * 7 * c2, hidden).build(env),
            fc2: Dense::builder(hidden, 10).build(env),
        }
    }
}

impl Module for ConvNet {
    fn eval<'s>(&self, input: DualArray<'s>, ctx: &EvalContext) -> DualArray<'s> {
        // generates ops for the value (forwards) and gradient (backwards) through the layers
        input
            .apply(&self.conv1, ctx)
            .leaky_relu(0.01)
            .max_pool2d((2, 2), (2, 2))
            .apply(&self.conv2, ctx)
            .leaky_relu(0.01)
            .max_pool2d((2, 2), (2, 2))
            .flatten()
            .apply(&Dropout::new(0.5), ctx)
            .apply(&self.fc1, ctx)
            .leaky_relu(0.01)
            .apply(&self.fc2, ctx)
    }
}
```
See the `fashion_mnist` example for more networks built using this API.
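
As a rough sketch of how the train/test distinction plays out, the snippet below evaluates the same `ConvNet` under a training context and a testing context, so dropout only affects the former. The `EvalContext::train()`/`EvalContext::test()` constructors and the `build_graph`/`scope.input` helpers are illustrative assumptions rather than the confirmed API; the `fashion_mnist` example shows the real training loop.

```rust
// Illustrative sketch only: `EvalContext::train()`, `EvalContext::test()`,
// `env.build_graph` and `scope.input` are assumed names, not confirmed API.
let mut env = Environment::new();
let net = ConvNet::new(&mut env);

// Assumed input shape: a batch of 64 single-channel 28x28 images.
let train_graph = env.build_graph(|scope| {
    let x = scope.input([64, 28, 28, 1]);
    // Dropout is active here, so different shader code is generated...
    let _logits = net.eval(x, &EvalContext::train());
});

let test_graph = env.build_graph(|scope| {
    let x = scope.input([64, 28, 28, 1]);
    // ...than here, where dropout compiles to a no-op.
    let _logits = net.eval(x, &EvalContext::test());
});
```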
## Examples
Please follow the link in each example's name for a more detailed description.
| Name | Description |
| --- | --- |
| `array_api` | Demonstrates the low-level Array API for building computation graphs (a rough sketch follows this table). See the README for more details. |
| `fashion_mnist` | Trains a few different network types on the Fashion-MNIST dataset. Demonstrates the use of anti-aliasing during max pooling for improved accuracy. See the README for a comparison of network performance. |
| `image_fit` | Overfits a few different network types to a single RGB image. Compares ReLU with positional encoding to a SIREN network. Update: now also compares to a multi-level hash encoding. |
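
To give a flavour of what building a graph with that low-level Array API might look like, here is a rough sketch. Every method name in it (`Environment::new`, `static_parameter`, `build_graph`, `scope.parameter`, `set_loss`) is an assumption made for illustration; the `array_api` example's README documents the real calls.

```rust
// Illustrative sketch only: these method names are assumptions, not the
// crate's confirmed API (see the array_api example for real usage).
let mut env = Environment::new();

// A scalar parameter backed by Vulkan device memory.
let x_param = env.static_parameter([1], "x");

let graph = env.build_graph(|scope| {
    // Ordinary Rust operator syntax records graph ops rather than computing
    // eagerly; element-wise ops are later fused into compute shaders.
    let x = scope.parameter(&x_param);
    let y = 2.0 * x * x - 3.0 * x + 7.0;
    // One level of automatic derivatives provides d(y)/d(x) for training.
    y.set_loss();
});
```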
## Dependencies
The following crates have been very useful in developing this project:
- `petgraph`: used for all graph data structures
- `slotmap`: storage with stable keys
- `shaderc`: interface to the GLSL compiler, used to generate SPIR-V for the shaders
## Potential Future Work
- Lookahead optimiser?
- Recurrent network
- SDF fitting
- Multi-level hash encoding