This is the first release of autograph rebuilt on SPIR-V compute shaders that can be compiled from Rust source with rust-gpu!
Compute Shaders
All computations are implemented in either Rust or GLSL (the latter to be replaced by Rust), and this API is publicly exposed so that external crates can develop their own routines. Shader code targeting SPIR-V is portable and is compiled at runtime for devices supporting the Vulkan, Metal, and DX12 APIs.
Datasets
The library includes the MNIST and Iris datasets to make it easy to get started; these are used in the examples.
Machine Learning
High level traits like Train, Test, and Infer are provided to create a common interface for different algorithms.
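As a rough illustration of how such traits can provide a common interface, here is a self-contained sketch with stand-in names and signatures (a trivial "model" that predicts the mean of its training targets; autograph's actual traits differ):

```rust
// Illustrative sketch of Train / Test / Infer-style traits.
// The model and signatures here are stand-ins, not autograph's real API.

struct MeanModel {
    mean: f32,
}

trait Train {
    fn train(&mut self, targets: &[f32]);
}

trait Infer {
    fn infer(&self) -> f32;
}

trait Test {
    // Returns the mean squared error against the given targets.
    fn test(&self, targets: &[f32]) -> f32;
}

impl Train for MeanModel {
    fn train(&mut self, targets: &[f32]) {
        self.mean = targets.iter().sum::<f32>() / targets.len() as f32;
    }
}

impl Infer for MeanModel {
    fn infer(&self) -> f32 {
        self.mean
    }
}

impl Test for MeanModel {
    fn test(&self, targets: &[f32]) -> f32 {
        targets.iter().map(|t| (t - self.mean).powi(2)).sum::<f32>() / targets.len() as f32
    }
}

fn main() {
    let mut model = MeanModel { mean: 0.0 };
    model.train(&[1.0, 2.0, 3.0]);
    println!("prediction: {}", model.infer()); // prediction: 2
}
```

Because the interface is shared, code that trains, evaluates, or runs inference can be written once against these traits and reused across algorithms.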
KMeans
An implementation of the KMeans classifier, demonstrated in the examples.
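For reference, the core of the algorithm can be sketched in plain Rust; this is a minimal 1-D host-side illustration only, since autograph's KMeans runs on the device via compute shaders:

```rust
// Minimal 1-D k-means sketch: alternate between assigning each point to
// its nearest centroid and moving each centroid to the mean of its points.

fn kmeans(points: &[f32], mut centroids: Vec<f32>, iters: usize) -> Vec<f32> {
    let k = centroids.len();
    for _ in 0..iters {
        let mut sums = vec![0.0f32; k];
        let mut counts = vec![0usize; k];
        // Assignment step.
        for &p in points {
            let nearest = (0..k)
                .min_by(|&a, &b| {
                    (p - centroids[a])
                        .abs()
                        .partial_cmp(&(p - centroids[b]).abs())
                        .unwrap()
                })
                .unwrap();
            sums[nearest] += p;
            counts[nearest] += 1;
        }
        // Update step.
        for i in 0..k {
            if counts[i] > 0 {
                centroids[i] = sums[i] / counts[i] as f32;
            }
        }
    }
    centroids
}

fn main() {
    let points = [1.0, 1.5, 0.5, 8.0, 8.5, 7.5];
    println!("{:?}", kmeans(&points, vec![0.0, 10.0], 10)); // [1.0, 8.0]
}
```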
Neural Networks
Networks can be constructed as a structure of Layers, including:
- Convolutions
- ReLU
- MaxPool
- Dense
Each of these layers implements the Layer and Forward traits, which can be derived to reduce boilerplate.
```rust
#[derive(Layer, Forward, Clone, Debug, Serialize, Deserialize)]
struct Lenet5 {
    #[autograph(layer)]
    conv1: Conv,
    #[autograph(layer)]
    relu1: Relu,
    #[autograph(layer)]
    pool1: MaxPool,
    #[autograph(layer)]
    conv2: Conv,
    #[autograph(layer)]
    relu2: Relu,
    #[autograph(layer)]
    pool2: MaxPool,
    #[autograph(layer)]
    dense1: Dense,
    #[autograph(layer)]
    relu3: Relu,
    #[autograph(layer)]
    dense2: Dense,
    #[autograph(layer)]
    relu4: Relu,
    #[autograph(layer)]
    dense3: Dense,
}
```
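Conceptually, the derived Forward impl chains forward() through each field tagged #[autograph(layer)], in declaration order. A simplified, self-contained sketch of what such generated code amounts to (the traits and layer types here are stand-ins, not autograph's real API):

```rust
// Stand-in Forward trait operating on plain Vec<f32> for illustration.
trait Forward {
    fn forward(&self, input: Vec<f32>) -> Vec<f32>;
}

// Two toy "layers": elementwise scale and elementwise bias.
struct Scale(f32);
struct Bias(f32);

impl Forward for Scale {
    fn forward(&self, input: Vec<f32>) -> Vec<f32> {
        input.into_iter().map(|x| x * self.0).collect()
    }
}

impl Forward for Bias {
    fn forward(&self, input: Vec<f32>) -> Vec<f32> {
        input.into_iter().map(|x| x + self.0).collect()
    }
}

struct Net {
    scale: Scale,
    bias: Bias,
}

// Hand-written version of the impl a derive would generate for Net:
// forward the input through each layer field in order.
impl Forward for Net {
    fn forward(&self, input: Vec<f32>) -> Vec<f32> {
        let x = self.scale.forward(input);
        self.bias.forward(x)
    }
}

fn main() {
    let net = Net { scale: Scale(2.0), bias: Bias(1.0) };
    println!("{:?}", net.forward(vec![1.0, 2.0])); // [3.0, 5.0]
}
```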
Similarly, backward ops can be defined using the Autograd and Backward traits, where Autograd can be derived in much the same way that Layer is.
```rust
#[derive(Autograd)]
struct DenseBackward {
    // Use vertex / optional_vertex for Variables and Parameters
    #[autograph(vertex)]
    input: Variable2,
    #[autograph(vertex)]
    weight: Parameter2,
    #[autograph(optional_vertex)]
    bias: Option<Parameter1>,
}
```
The intent is that users can write their own custom, modular layers and functions which can be defined from the high level down to custom shader code, all implemented in Rust.
Status
The crate is fairly minimal: implementations are missing for some data types, bf16 is not supported for convolution and pooling layers, and many functions such as matrix multiplication are internal and not publicly exposed. Potential work items:
- Fully support bf16 in Neural Networks, with a nicer means to convert from f32 to bf16 and back for Variables and Parameters.
- Render the backward "graph" using petgraph for visualization and debugging purposes.
- Profiling tools for evaluating key functions / shaders and for improving the engine itself.
- Port the remaining GLSL to Rust; rust-gpu barriers are not working yet, and the need for code duplication (particularly for bf16) must be reduced.
- Improve performance, particularly the GEMM implementation.
- Implement more operations and algorithms:
  - MeanPool is implemented but its backward pass is not yet working.
  - Binary ops like addition are easy but not yet implemented due to uncertainty over the API (with regard to Residual layers and others with more than two inputs).
  - SGD with momentum is not yet implemented; other optimizers should be implemented as well.
- Model parallelism is supported but not tested or optimized. Data parallelism is intended to override Layer::update() to perform an all-reduce (i.e. mean) over the gradients for each parameter duplicated on several devices prior to the optimization step.
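The all-reduce step described in the last item can be sketched on the host in plain Rust (illustrative only; the real implementation would operate on device buffers, and the function name here is hypothetical):

```rust
// Each device holds its own copy of a parameter's gradient. Before the
// optimization step, average the copies elementwise and write the mean
// back to every device, so all replicas apply the same update.

fn all_reduce_mean(grads_per_device: &mut [Vec<f32>]) {
    let n_devices = grads_per_device.len() as f32;
    let len = grads_per_device[0].len();
    for i in 0..len {
        let mean: f32 =
            grads_per_device.iter().map(|g| g[i]).sum::<f32>() / n_devices;
        for g in grads_per_device.iter_mut() {
            g[i] = mean;
        }
    }
}

fn main() {
    // Gradients for one parameter, duplicated on two devices.
    let mut grads = vec![vec![1.0, 2.0], vec![3.0, 4.0]];
    all_reduce_mean(&mut grads);
    println!("{:?}", grads); // [[2.0, 3.0], [2.0, 3.0]]
}
```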
Contributors
Thank you to those who have contributed to the project!