Tensors and dynamic neural networks in pure Rust.

Overview

Neuronika is a machine learning framework written in pure Rust, built with a focus on ease of use, fast prototyping and performance.

Dynamic neural networks and auto-differentiation

At the core of Neuronika lies reverse-mode automatic differentiation, which lets you define dynamically changing neural networks with very low effort and no overhead, using a lean, fully imperative, define-by-run API.
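
As a rough illustration of the define-by-run style, here is a minimal sketch based on the variable API that appears in the issues further down this page (the exact operator and method surface may differ between versions):

    fn main() {
        // Variables are created eagerly; every operation records a new node in
        // the graph as it runs, so ordinary Rust control flow shapes the graph.
        let x = neuronika::rand((3, 3)).requires_grad();
        let w = neuronika::rand((3, 3)).requires_grad();

        let y = (x + w).relu();

        y.forward();    // evaluate the recorded graph
        y.backward(1.); // accumulate gradients into the differentiable leaves
    }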

The power of Rust

The Rust language allows for an intuitive, lightweight, and easy-to-use interface while achieving incredible performance. There is no need for an FFI; everything happens in front of your eyes.

Crate Feature Flags

The following crate feature flags are available. They configure the ndarray backend.

  • serialize

    • Enables serialization support for serde 1.x.
  • blas

    • Enables transparent BLAS support for matrix multiplication. Uses blas-src as a pluggable backend, which needs to be configured separately; see the sketch after this list for more information.
  • matrixmultiply-threading

    • Enables the threading feature of the matrixmultiply backend, which ndarray uses for matrix multiplication when BLAS is not enabled.

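When the blas feature is enabled, the concrete BLAS implementation is selected downstream through blas-src. A minimal sketch of that wiring, assuming the OpenBLAS backend (the blas-src version and feature name here are illustrative):

    // In the Cargo.toml of the crate that enables the `blas` feature, pick a
    // BLAS backend through `blas-src`, e.g.:
    //
    //     blas-src = { version = "0.8", features = ["openblas"] }
    //
    // Then link the chosen implementation in the crate root:
    extern crate blas_src;
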
Project Status

Neuronika is very young and rapidly evolving; we are continuously developing the project, and breaking changes are expected between versions. We adopt the newest stable Rust features when we need them.

Comments
  • Is it possible to dynamically construct a compute graph?

    It seems that the laziness of compute graph construction does not allow dynamic control flow that depends on data values?

    For example, in PyTorch, we can do:

    def foo(x):
        if x > 5: # suppose x is a scalar tensor
            return x * 2
        else:
            return x 
    

    And we can even have loops that depend on the values and/or shapes of tensors. Now it seems neuronika does not have such capabilities, right? Because before forward(), everything in the compute graph is symbolic(?).

    I don't know when a compute graph is destroyed, but if a compute graph can dynamically be a subgraph of a larger one, then it seems possible to have something like:

    let x = neuronika::rand((1,)).requires_grad();
    let y = x * 2;
    let y_val = y.forward(); // evaluate a value from a subgraph
    let z = if y_val > 1.0 {
        y * 2
    } else {
        y * 3
    }; // add more compute nodes based on y_val
    z.forward(); // evaluate a whole graph. Probably with redundant computations on the above subgraph? Or is there any cache?
    z.backward();
    
    opened by ifsheldon 12
  • Neural Networks

    The trait Model

    I propose to create a public trait that users can implement to create their custom models. I provide a sketch hereafter.

    pub trait Model {
        fn forward(&self, (...) );
    }
    
    impl Model for MyNet {
        // Something something machine learning.
    }
    
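    For concreteness, one hypothetical shape such a trait could take (names and types here are illustrative, not the crate's actual API):

    pub trait Model {
        type Input;
        type Output;

        fn forward(&self, input: Self::Input) -> Self::Output;
    }

    struct MyNet;

    impl Model for MyNet {
        type Input = Vec<f32>;
        type Output = Vec<f32>;

        fn forward(&self, input: Vec<f32>) -> Vec<f32> {
            // Something something machine learning.
            input
        }
    }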

    ~~The callable question~~

    ~~Should we make use of this very cool but nightly feature? This would allow for something along the lines of this:~~

    let net = MyModel::new();
    let prediction = net(data); // Maybe sometime in the future =(
    

    ~~I personally would love to have such an API.~~ Let's stick with stable.

    Layers' initialisation functions

    These are a must; we should take a quick look at this and then decide. We should also look at that and this; I think these features will probably be useful for our purposes.

    The essentials in my opinion:

    • [x] Uniform
    • [x] Normal
    • [x] Constant
    • [x] Ones
    • [x] Zeros
    • [x] Eye
    • [x] ~~Dirac~~ we should probably consider this at a later time, since it is used only on n-dimensional tensors with n >= 3
    • [x] Xavier Uniform
    • [x] Xavier Normal

    These should be easy and straightforward to implement.
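
    As a reference for the Xavier items, here is the standard Glorot uniform rule written as a standalone ndarray sketch (it is not tied to the crate's nn::init module):

    use ndarray::Array2;
    use ndarray_rand::rand_distr::Uniform;
    use ndarray_rand::RandomExt;

    /// Xavier/Glorot uniform: draw from U(-a, a) with
    /// a = gain * sqrt(6 / (fan_in + fan_out)).
    fn xavier_uniform(fan_in: usize, fan_out: usize, gain: f32) -> Array2<f32> {
        let a = gain * (6.0 / (fan_in + fan_out) as f32).sqrt();
        Array2::random((fan_out, fan_in), Uniform::new(-a, a))
    }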

    Type of Layers

    The essentials in my opinion:

    • [x] Linear
    • [x] Recurrent
    • [x] Conv1d
    • [x] Conv2d
    • [x] Conv3d
    • [x] GRU
    • [x] LSTM
    • [x] Dropout

    Useful links

    documentation enhancement help wanted question urgent 
    opened by frjnn 12
  • Leaky_relu exploding values

    Hello,

    Basically, the following code produces exploding gradients, values, and losses. I've tested different optimizers (SGD, Adagrad, Adam) with multiple activation functions (softplus, relu, and leaky_relu).

    The optimizer parameters change the time to convergence, but basically, while softplus and relu stabilize around the correct answer with relatively small losses, waiting for a sufficiently large number of epochs leads to exploding values with leaky_relu:

    [...]
    epoch 181 loss: 18083050
    epoch 182 loss: 14513438
    epoch 183 loss: 211705540
    epoch 184 loss: 113424490
    epoch 185 loss: 4869084000
    epoch 186 loss: 981610800
    epoch 187 loss: 225812140000
    epoch 188 loss: 2468554200000
    epoch 189 loss: 131793980000000
    epoch 190 loss: 24977954000000000
    epoch 191 loss: 88780490000000000000
    epoch 192 loss: 138728520000000000000000000
    epoch 193 loss: 105674000000000000000000000000000000000
    epoch 194 loss: NaN
    epoch 195 loss: NaN
    epoch 196 loss: NaN
    epoch 197 loss: NaN
    epoch 198 loss: NaN
    epoch 199 loss: NaN
    [...]
    

    I'm not sure how to completely seed the learning process, but I can make this behavior happen consistently. ReLU and softplus should offer relatively similar performance, I think, so there is probably some kind of computational mistake here (I assume it's not my code, since I've tested many things).

    use ndarray::{Ix4, stack, Axis, Array, Dim};
    
    use neuronika::Param;
    use neuronika::{Data, Gradient, Overwrite, VarDiff, Convolve, from_ndarray};
    use neuronika::optim::{Adam, L2};
    use neuronika::nn::{ModelStatus, Conv2d, Learnable};
    use neuronika::nn;
    use neuronika::nn::init::{calculate_gain, xavier_normal};
    
    // Network definition.
    pub struct NeuralNetwork {
    	con1: Conv2d<nn::Zero>,
    	con2: Conv2d<nn::Zero>,
    	con3: Conv2d<nn::Zero>,
    	con4: Conv2d<nn::Zero>,
    	status: ModelStatus,     
    }
    
    impl NeuralNetwork {
    	fn new() -> Self {
    		let mut status = ModelStatus::default();
    
    		let con1 = status.register(
    			Conv2d::new(5, 20, (5, 5), (0, 0), nn::Zero, (1, 1), (1, 1))
    		);
    		let con2 = status.register(
    			Conv2d::new(20, 40, (5, 5), (0, 0), nn::Zero, (1, 1), (1, 1))
    		);
    		let con3 = status.register(
    			Conv2d::new(40, 20, (2, 2), (0, 0), nn::Zero, (1, 1), (1, 1))
    		);
    		let con4 = status.register(
    			Conv2d::new(20, 1, (1, 1), (0, 0), nn::Zero, (1, 1), (1, 1))
    		);
    		
    		Self {
    			con1: con1,
    			con2: con2,
    			con3: con3,
    			con4: con4,
    			status,
    		}
    	}
    
        fn forward<I, T, U>(
            &self,
            input: I,
        ) -> VarDiff<impl Data<Dim = Ix4>, impl Gradient<Dim = Ix4> + Overwrite>
        where
    		I: Convolve<I, Learnable<Ix4>, nn::Zero>,
    		I::Output: Into<VarDiff<T, U>>,
    		T: Data<Dim = Ix4>,
    		U: Gradient<Dim = Ix4> + Overwrite,
        {
    		let out1 = self.con1.forward(input).leaky_relu();
    		let out2 = self.con2.forward(out1).leaky_relu();
    		let out3 = self.con3.forward(out2).leaky_relu();
    		let out4 = self.con4.forward(out3);
    		out4
        }
    
    
    	pub fn parameters(&self) -> Vec<Param> {
    		self.status.parameters()
    	}
    
    	/// Switches the network in training mode.
    	fn train(&self) {
    		self.status.train()
    	}
    }
    
    
    fn main(){
    	let model = NeuralNetwork::new();
    	let model_optim = neuronika::optim::SGD::new(model.parameters(), 0.001, neuronika::optim::L2::new(0.));
    	model.train();
    	
    	// input data should be irrelevant here (see target below)
    	let data_point: Array<f32, Dim<[usize; 4]>> = stack(Axis(0), &[Array::eye(10).view(), Array::eye(10).view(), Array::eye(10).view(), Array::eye(10).view(), Array::eye(10).view()]).unwrap().into_shape((1, 5, 10, 10)).unwrap(); 
    	
    	for i in 0..600{
    	    let x = model.forward(from_ndarray(data_point.clone()));
    	    
    	    // model should learn constant function f(x) = 10
    	    let y = from_ndarray(Array::from_vec(vec![10.]).into_shape((1, 1, 1, 1)).unwrap());
    	    
    	    let loss = nn::loss::mae_loss(x.clone(), y, nn::loss::Reduction::Mean);
    	    loss.forward();
    	    loss.backward(1.);
    	    
    	    println!("epoch {} loss: {}", i, &loss.data());
    	    
    	    model_optim.step();
    	}
    	
    }
    
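    For reference, the standard leaky ReLU and its derivative are the following (standalone ndarray sketch, independent of neuronika's implementation; a small negative slope such as 0.01 is the usual default):

    use ndarray::ArrayD;

    fn leaky_relu(x: &ArrayD<f32>, alpha: f32) -> ArrayD<f32> {
        x.mapv(|v| if v > 0.0 { v } else { alpha * v })
    }

    fn leaky_relu_grad(x: &ArrayD<f32>, alpha: f32) -> ArrayD<f32> {
        x.mapv(|v| if v > 0.0 { 1.0 } else { alpha })
    }
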
    opened by lodykas 5
  • An error occurs when I run the example

    Hi, I am a beginner in Rust. When I run the "quickstart" example code in the examples directory, an error occurs. The error information is:

    error[E0277]: the trait bound `NeuralNetwork: serde::de::Deserialize<'_>` is not satisfied
        --> src/main.rs:51:21
         |
    51   |     let json_data = serde_json::from_str(
         |                     ^^^^^^^^^^^^^^^^^^^^ the trait `serde::de::Deserialize<'_>` is not implemented for `NeuralNetwork`
         |
    note: required by a bound in `serde_json::from_str`
        --> /home/lin/.cargo/registry/src/github.com-1ecc6299db9ec823/serde_json-1.0.79/src/de.rs:2611:8
         |
    2611 |     T: de::Deserialize<'a>,
         |        ^^^^^^^^^^^^^^^^^^^ required by this bound in `serde_json::from_str`

    I know what it means, but do I need to write the code for the Deserialize trait, or maybe the reason is a wrong configuration in my toml file?

    The content of my toml file is:

    [dependencies]
    neuronika = { git = "https://github.com/neuronika/neuronika" }
    csv = "1.1.6"
    itertools = "0.10.3"
    ndarray = { version = "0.15.4", features = ["rayon"] }
    ndarray-rand = "0.14.0"
    rand = "0.8.4"
    rand_distr = "0.4.2"
    rayon = "1.5.1"
    serde = { version = "1.0.136", features = ["derive"] }
    serde_json = "1.0.79"

    [features]
    blas = ["ndarray/blas"]
    matrixmultiply-threading = ["ndarray/matrixmultiply-threading"]
    serialize = ["ndarray/serde"]

    Help me, please. Thank you.
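
    For context, the quickstart gates its serde derives behind the crate's serialize feature, roughly as in the sketch below (illustrative only; the field names are hypothetical, and whether this resolves the error depends on how the features are enabled):

    use neuronika::nn::{Linear, ModelStatus};

    // The derives are only emitted when the `serialize` feature is active, so
    // without it the `Deserialize` bound on `NeuralNetwork` stays unsatisfied.
    #[cfg_attr(feature = "serialize", derive(serde::Serialize, serde::Deserialize))]
    struct NeuralNetwork {
        lin1: Linear,
        lin2: Linear,
        #[cfg_attr(feature = "serialize", serde(skip))]
        status: ModelStatus,
    }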

    opened by xuexl 4
  • Enhancement of reduction procedure

    Description of the problem

    Let's talk for a moment about the reduction mechanics implemented in neuronika. Currently, during a backward pass, we "shrink" the gradient entering each node by calling the two functions reduce and sum_axis_inplace.

    Functions Code
    fn sum_axis_inplace(arr: &mut DynTensor, axis: Axis) {
        let (first, rest) = arr.view_mut().split_at(axis, 1);
        Zip::from(first.remove_axis(axis))
            .and(rest.lanes(axis))
            .for_each(|dst, src| *dst += src.sum());
        arr.index_axis_inplace(axis, 0);
    }
    
    pub fn reduce<D: Dimension, E: Dimension>(dest: &Tensor<D>, src: &Tensor<E>) -> DynTensor {
        let mut dyn_rhs = src.clone().into_dyn(); // <- Very costly operation!
    
        unsafe {
            while (*(&dyn_rhs as *const DynTensor)).ndim() > dest.ndim() {
                sum_axis_inplace(&mut dyn_rhs, Axis(0));
            }
        }
    
        for (axis, size) in dest.shape().iter().enumerate() {
            if *size == 1 {
                sum_axis_inplace(&mut dyn_rhs, ndarray::Axis(axis));
                dyn_rhs.insert_axis_inplace(ndarray::Axis(axis));
            }
        }
    
        dyn_rhs
    }
    

    As you can see from the first line of reduce, every time it is executed it performs a .clone(); since the type it is called on is an Array that uniquely owns its contents, we are actually allocating and copying it. Moreover, in a scenario where the src and dest shapes are equal, this operation is completely useless, as we could simply add the two together.
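
    The kind of fast path meant here could look like the following sketch (assuming a mutable destination gradient; illustrative only, not the crate's code):

    use ndarray::{ArrayD, Zip};

    /// When `src` and `dst` already have the same shape there is nothing to
    /// reduce: accumulate in place, with no intermediate allocation.
    fn accumulate_same_shape(dst: &mut ArrayD<f32>, src: &ArrayD<f32>) {
        debug_assert_eq!(dst.shape(), src.shape());
        Zip::from(dst).and(src).for_each(|d, s| *d += *s);
    }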

    Possible solution

    One way to avoid all of these penalties is to introduce a new Reducer structure that contains a buffer with the same capacity as the largest gradient on which a reduction has to be performed. When a reduction is needed, the contents of the input tensor are copied into the buffer, which is then used to create a new ArrayViewMut on which the actual accumulation takes place.

    Solution Code
    pub struct Reducer {
        buffer: Vec<f32>,
    }
    
    impl Reducer {
        fn sum_axis_inplace(arr: &mut ArrayViewMut<f32, IxDyn>, axis: Axis) {
            let (first, rest) = arr.view_mut().split_at(axis, 1);
            Zip::from(first.remove_axis(axis))
                .and(rest.lanes(axis))
                .for_each(|dst, src| *dst += src.sum());
            arr.index_axis_inplace(axis, 0);
        }
    
        fn load(&mut self, src: &ArrayView<f32, IxDyn>) {
            let src_len = src.len();
    
            assert!(src_len <= self.buffer.capacity());
    
            unsafe {
                ptr::copy_nonoverlapping(src.as_ptr(), self.buffer.as_mut_ptr(), src_len);
                self.buffer.set_len(src_len);
            }
        }
    
        pub fn new(cap: usize) -> Self {
            assert_ne!(cap, 0);
    
            Self {
                buffer: Vec::with_capacity(cap),
            }
        }
    
        pub fn reduce<D: Dimension>(&mut self, dst: &mut Array<f32, D>, src: ArrayView<f32, IxDyn>) {
            let excess = src.ndim() - dst.ndim();
            let to_shrink = src.shape()[excess..src.ndim()]
                .iter()
                .enumerate()
                .filter(|(_, len)| **len == 1)
                .map(|(id, _)| id)
                .collect::<Vec<_>>();
    
            if excess == 0 && to_shrink.is_empty() {
                // dst.assign(&src);
                return;
            }
    
            self.load(&src);
            let mut src =
                unsafe { ArrayViewMut::from_shape_ptr(src.raw_dim(), self.buffer.as_mut_ptr()) };
    
            for _ in 0..excess {
                Reducer::sum_axis_inplace(&mut src, Axis(0));
            }
    
            for axis in to_shrink {
                Reducer::sum_axis_inplace(&mut src, Axis(axis));
                src.insert_axis_inplace(Axis(axis));
            }
    
            // dst.assign(&src);
        }
    }
    

    Test Configuration

    Both versions have been tested on the following code with the --release flag, twice: once under valgrind to assess memory usage, and once without to measure completion time.

    New Code Test
    const TRIES: usize = 16;
    
    let mut reducer = Reducer::new(1_024 * 1_024 * 4);
    let src = Array::from_elem((1_024, 1_024, 4), 1.);
    let mut dst = Array::from_elem((1_024, 4), 0.);
    let res = (0..TRIES)
        .map(|_| {
            let start = Instant::now();
            for _ in 0..1_024 {
                reducer.reduce(&mut dst, src.view().into_dyn());
            }
            start.elapsed().as_millis()
        })
        .sum::<u128>();
    println!(
        "Mean elapsed time: {} milliseconds",
        res as f64 / TRIES as f64
    );
    
    Old Code Test
    const TRIES: usize = 16;
    
    let src = Array::from_elem((1_024, 1_024, 4), 1.);
    let dst = Array::from_elem((1_024, 4), 0.);
    let res = (0..TRIES)
        .map(|_| {
            let start = Instant::now();
            for _ in 0..1_024 {
                neuronika::variable::node::reduce(&dst, &src);
            }
            start.elapsed().as_millis()
        })
        .sum::<u128>();
    println!(
        "Mean elapsed time: {} milliseconds",
        res as f64 / TRIES as f64
    );
    

    Results

    | Version  | Iterations | Source            | Destination | Time (ms) | Allocations | Bytes Used     |
    |:--------:|:----------:|:-----------------:|:-----------:|----------:|------------:|---------------:|
    | Current  | 1'024      | (1'024, 1'024, 4) | (1'024, 4)  | 10'931.81 |       1,037 | 17,196,664,945 |
    | Proposed | 1'024      | (1'024, 1'024, 4) | (1'024, 4)  |  9'561.56 |          14 |     33,572,977 |

    Looking at the numbers, we can see that this new version, even on only 1'024 iterations, is almost 1.5 seconds faster, and uses 512 times less memory.

    enhancement help wanted 
    opened by ste-pac 4
  • Redesign of the nodes infrastructure

    This issue is related to the redesign of the basic infrastructure used by the library for automatic differentiation.

    Currently we manage the propagation of gradients and the invocations of the .forward() and .backward() methods through recursive calls, controlled by appropriate control structures such as ForwardAction and BackwardAction to avoid performing some computations multiple times. Unfortunately, this organization has the disadvantage of creating a large number of recursive calls on the stack, with a consequent loss of performance. The new approach we could take advantage of is based on moving to an iterative version using trait objects which, even though they have the disadvantage of dynamic dispatch, could lead to a simpler implementation as well as a potential performance improvement.

    A possible implementation example is the following.

    Prototype Code
    use indexmap::IndexMap;
    use ndarray::{Array, DimMax, Dimension, RawArrayView, ShapeError, StrideShape, Zip};
    use std::{
        cell::{Ref, RefCell},
        ops::{Deref, DerefMut},
        rc::Rc,
    };
    
    pub(crate) type Broadcasted<Lhs, Rhs> = <Lhs as DimMax<Rhs>>::Output;
    pub(crate) type BroadTensor<Lhs, Rhs> = Tensor<Broadcasted<Lhs, Rhs>>;
    pub(crate) type Tensor<D> = Array<f32, D>;
    pub(crate) type RawTensorView<D> = RawArrayView<f32, D>;
    pub(crate) type RawBroadTensorView<Lhs, Rhs> = RawArrayView<f32, Broadcasted<Lhs, Rhs>>;
    
    // ============================================= Utils =============================================
    
    fn broadcasted_zero<Lhs, Rhs>(
        left: Ref<Tensor<Lhs>>,
        right: Ref<Tensor<Rhs>>,
    ) -> BroadTensor<Lhs, Rhs>
    where
        Lhs: Dimension + DimMax<Rhs>,
        Rhs: Dimension,
    {
        let (mut bigger, smaller) = if left.ndim() >= right.ndim() {
            (left.shape().to_vec(), right.shape())
        } else {
            (right.shape().to_vec(), left.shape())
        };
        for (l, r) in bigger.iter_mut().rev().zip(smaller.iter().rev()) {
            *l = std::cmp::max(*l, *r);
        }
        let total = bigger.iter().product();
        Tensor::from_shape_vec(bigger, vec![0.; total])
            .unwrap()
            .into_dimensionality::<Broadcasted<Lhs, Rhs>>()
            .unwrap()
    }
    
    // ========================================= Forward Nodes =========================================
    
    pub trait Node {
        type Dim: Dimension;
    
        fn uid(&self) -> usize;
    
        fn value(&self) -> Ref<Tensor<Self::Dim>>;
    }
    
    pub trait Forward {
        fn forward(&self);
    }
    
    // Input
    pub struct InputNode<D: Dimension> {
        uid: usize,
        value: RefCell<Tensor<D>>,
    }
    
    impl<D: Dimension> InputNode<D> {
        pub fn new<Sh>(uid: usize, shape: Sh, vec: Vec<f32>) -> Result<Self, ShapeError>
        where
            Sh: Into<StrideShape<D>>,
        {
            Ok(Self {
                uid,
                value: RefCell::new(Array::from_shape_vec(shape, vec)?),
            })
        }
    }
    
    impl<D: Dimension> Node for InputNode<D> {
        type Dim = D;
    
        fn uid(&self) -> usize {
            self.uid
        }
    
        fn value(&self) -> Ref<Tensor<D>> {
            self.value.borrow()
        }
    }
    
    impl<D: Dimension> Forward for InputNode<D> {
        fn forward(&self) {
            // Nothing
        }
    }
    
    // Parameter
    pub struct ParameterNode<D: Dimension> {
        uid: usize,
        value: RefCell<Tensor<D>>,
    }
    
    impl<D: Dimension> ParameterNode<D> {
        pub fn new<Sh>(uid: usize, shape: Sh, vec: Vec<f32>) -> Result<Self, ShapeError>
        where
            Sh: Into<StrideShape<D>>,
        {
            Ok(Self {
                uid,
                value: RefCell::new(Array::from_shape_vec(shape, vec)?),
            })
        }
    }
    
    impl<D: Dimension> Node for ParameterNode<D> {
        type Dim = D;
    
        fn uid(&self) -> usize {
            self.uid
        }
    
        fn value(&self) -> Ref<Tensor<D>> {
            self.value.borrow()
        }
    }
    
    impl<D: Dimension> Forward for ParameterNode<D> {
        fn forward(&self) {
            // Nothing
        }
    }
    
    // Addition
    pub struct AdditionNode<Lhs, Rhs>
    where
        Lhs: Node,
        Rhs: Node,
        Lhs::Dim: Dimension + DimMax<Rhs::Dim>,
    {
        uid: usize,
        left: Rc<Lhs>,
        right: Rc<Rhs>,
        value: RefCell<BroadTensor<Lhs::Dim, Rhs::Dim>>,
    }
    
    impl<Lhs, Rhs> AdditionNode<Lhs, Rhs>
    where
        Lhs: Node,
        Rhs: Node,
        Lhs::Dim: Dimension + DimMax<Rhs::Dim>,
    {
        pub fn new(uid: usize, left: Rc<Lhs>, right: Rc<Rhs>) -> Self {
            let value = RefCell::new(broadcasted_zero(left.value(), right.value()));
            Self {
                uid,
                left,
                right,
                value,
            }
        }
    }
    
    impl<Lhs, Rhs> Node for AdditionNode<Lhs, Rhs>
    where
        Lhs: Node,
        Rhs: Node,
        Lhs::Dim: Dimension + DimMax<Rhs::Dim>,
    {
        type Dim = Broadcasted<Lhs::Dim, Rhs::Dim>;
    
        fn uid(&self) -> usize {
            self.uid
        }
    
        fn value(&self) -> Ref<Tensor<Self::Dim>> {
            self.value.borrow()
        }
    }
    
    impl<Lhs, Rhs> Forward for AdditionNode<Lhs, Rhs>
    where
        Lhs: Node,
        Rhs: Node,
        Lhs::Dim: Dimension + DimMax<Rhs::Dim>,
    {
        fn forward(&self) {
            Zip::from(self.value.borrow_mut().deref_mut())
                .and_broadcast(self.left.value().deref())
                .and_broadcast(self.right.value().deref())
                .par_for_each(|v, l, r| *v = l + r);
        }
    }
    
    // ============================================ Backward Nodes ============================================
    
    pub trait DiffNode: Node {
        fn connect_source(&mut self, node_uid: usize, node_view: RawTensorView<Self::Dim>);
    
        fn disconnect_source(&mut self, node_id: usize);
    }
    
    pub trait Backward: Forward {
        fn backward(&self);
    }
    
    // Addition
    pub struct AdditionDiffNode<Lhs, Rhs>
    where
        Lhs: Node,
        Rhs: Node,
        Lhs::Dim: Dimension + DimMax<Rhs::Dim>,
    {
        addition: AdditionNode<Lhs, Rhs>,
        gradient: RefCell<BroadTensor<Lhs::Dim, Rhs::Dim>>,
        sources: IndexMap<usize, RawBroadTensorView<Lhs::Dim, Rhs::Dim>>,
    }
    
    impl<Lhs, Rhs> AdditionDiffNode<Lhs, Rhs>
    where
        Lhs: Node,
        Rhs: Node,
        Lhs::Dim: Dimension + DimMax<Rhs::Dim>,
    {
        pub fn new(addition: AdditionNode<Lhs, Rhs>) -> Self {
            let gradient = RefCell::new(Tensor::zeros(addition.value().raw_dim()));
            Self {
                addition,
                gradient, // ! To be changed into two separate gradients
                sources: IndexMap::new(),
            }
        }
    }
    
    impl<Lhs, Rhs> Node for AdditionDiffNode<Lhs, Rhs>
    where
        Lhs: Node,
        Rhs: Node,
        Lhs::Dim: Dimension + DimMax<Rhs::Dim>,
    {
        type Dim = Broadcasted<Lhs::Dim, Rhs::Dim>;
    
        fn uid(&self) -> usize {
            self.addition.uid()
        }
    
        fn value(&self) -> Ref<Tensor<Self::Dim>> {
            self.addition.value()
        }
    }
    
    impl<Lhs, Rhs> Forward for AdditionDiffNode<Lhs, Rhs>
    where
        Lhs: Node,
        Rhs: Node,
        Lhs::Dim: Dimension + DimMax<Rhs::Dim>,
    {
        fn forward(&self) {
            self.addition.forward();
        }
    }
    
    impl<Lhs, Rhs> DiffNode for AdditionDiffNode<Lhs, Rhs>
    where
        Lhs: Node,
        Rhs: Node,
        Lhs::Dim: Dimension + DimMax<Rhs::Dim>,
    {
        fn connect_source(&mut self, node_id: usize, node_view: RawTensorView<Self::Dim>) {
            assert!(
                self.sources.insert(node_id, node_view).is_none(),
                "Node {} already connected to {}",
                node_id,
                self.addition.uid()
            );
        }
    
        fn disconnect_source(&mut self, source_id: usize) {
            assert!(
                self.sources.remove(&source_id).is_some(),
                "Node {} isn't connected to {}",
                source_id,
                self.addition.uid()
            );
        }
    }
    
    impl<Lhs, Rhs> Backward for AdditionDiffNode<Lhs, Rhs>
    where
        Lhs: Node,
        Rhs: Node,
        Lhs::Dim: Dimension + DimMax<Rhs::Dim>,
    {
        fn backward(&self) {
            if self.sources.is_empty() {
                return;
            }
    
            let mut gradient = self.gradient.borrow_mut();
            let mut sources = self
                .sources
                .values()
                .map(|v| unsafe { v.clone().deref_into_view() });
    
            gradient.assign(&sources.next().unwrap());
            for source in sources {
                Zip::from(gradient.deref_mut())
                    .and_broadcast(source)
                    .par_for_each(|l, r| *l += r);
            }
        }
    }
    
    // =================================================== Tests ===================================================
    
    #[cfg(test)]
    mod tests {
        use super::*;
    
        mod api {
            use super::*;
            use rand::{self, Rng};
    
            #[test]
            fn addition_benchmark() {
                let forward_nodes: Vec<Rc<dyn Forward>> = vec![
                    Rc::new(AdditionNode::new(
                        0,
                        Rc::new(InputNode::new(0, (1_000, 1_000), vec![1.; 1_000 * 1_000]).unwrap()),
                        Rc::new(InputNode::new(0, (1_000, 1_000), vec![1.; 1_000 * 1_000]).unwrap()),
                    ));
                    1_024
                ];
    
                let mut times = Vec::with_capacity(1_024);
                for node in &forward_nodes {
                    let start = std::time::Instant::now();
                    node.forward();
                    let stop = start.elapsed();
                    times.push(stop.as_micros());
                }
                println!(
                    "Mean Forward Time Iteration: {} microseconds",
                    times.iter().sum::<u128>() / times.len() as u128
                );
    
                let start = std::time::Instant::now();
                for node in &forward_nodes {
                    node.forward();
                }
                let elapsed = start.elapsed();
                println!("Mean Forward Time: {} milliseconds", elapsed.as_millis());
            }
    
            #[test]
            fn backward_addition_benchmark() {
                let gradient_sources = vec![Tensor::from_elem((1_000, 1_000), 0.); 128];
    
                let mut addition_nodes = Vec::with_capacity(1_024);
                for _ in 0..1_024 {
                    addition_nodes.push(AdditionDiffNode::new(AdditionNode::new(
                        0,
                        Rc::new(InputNode::new(0, (1_000, 1_000), vec![1.; 1_000 * 1_000]).unwrap()),
                        Rc::new(InputNode::new(0, (1_000, 1_000), vec![1.; 1_000 * 1_000]).unwrap()),
                    )));
                }
    
                let mut rng = rand::thread_rng();
                for node in &mut addition_nodes {
                    node.connect_source(0, gradient_sources[rng.gen_range(0..128)].raw_view());
                    node.connect_source(1, gradient_sources[rng.gen_range(0..128)].raw_view());
                    node.connect_source(2, gradient_sources[rng.gen_range(0..128)].raw_view());
                    node.connect_source(3, gradient_sources[rng.gen_range(0..128)].raw_view());
                }
    
                let mut backward_nodes: Vec<Rc<dyn Backward>> = Vec::with_capacity(1_024);
                for _ in 0..1_024 {
                    backward_nodes.push(Rc::new(addition_nodes.swap_remove(0)));
                }
    
                let mut times = Vec::with_capacity(1_024);
                for node in &backward_nodes {
                    let start = std::time::Instant::now();
                    node.backward();
                    let stop = start.elapsed();
                    times.push(stop.as_micros());
                }
                println!(
                    "Mean Backward Time Iteration: {} microseconds",
                    times.iter().sum::<u128>() / times.len() as u128
                );
    
                let start = std::time::Instant::now();
                for node in &backward_nodes {
                    node.backward();
                }
                let elapsed = start.elapsed();
                println!("Mean Backward Time: {} milliseconds", elapsed.as_millis());
            }
        }
    }
    

    The example is quite trivial, using only Addition nodes, but considering the depth of the graph (1024 nodes) and the size of the tensors (1 million elements each), the benchmarks don't disappoint. I warmly invite you to try them out and to raise any ideas or objections that cross your mind.

    The resulting benchmark times are the following:

    |                     | Forward | Backward |
    |:-------------------:|:-------:|:--------:|
    | Iteration Mean Time | 156 μs  | 2 ms     |
    | Total Mean Time     | 156 ms  | 1'500 ms |

    Consider also that, in the .backward() case, each node is connected to 4 other source nodes of 1 million entries each.

    enhancement urgent 
    opened by ste-pac 4
  • Investigate whether it is possible to rework the mat-mat, mat-vec and vec-vec muls

    We should consider the possibility of finding a more elegant solution.

    The key issue here is that, supposing we had only a single dot method to perform the aforementioned operations, we would need to create the appropriate computational graph node while having only one method defined on GraphBuilder. This can't be done, as the type returned by that method cannot vary according to the dimensions of the operands. @ste-pac what do you think?
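
    One direction worth sketching (hypothetical, not the crate's API) is a trait whose associated Output type lets the result vary with the operands' dimensions, in the spirit of ndarray's own dot:

    use ndarray::{Array1, Array2};

    trait GraphDot<Rhs> {
        type Output;
        fn graph_dot(&self, rhs: &Rhs) -> Self::Output;
    }

    // mat-mat: 2-d by 2-d -> 2-d
    impl GraphDot<Array2<f32>> for Array2<f32> {
        type Output = Array2<f32>;
        fn graph_dot(&self, rhs: &Array2<f32>) -> Array2<f32> {
            self.dot(rhs)
        }
    }

    // mat-vec: 2-d by 1-d -> 1-d
    impl GraphDot<Array1<f32>> for Array2<f32> {
        type Output = Array1<f32>;
        fn graph_dot(&self, rhs: &Array1<f32>) -> Array1<f32> {
            self.dot(rhs)
        }
    }

    // vec-vec: 1-d by 1-d -> scalar
    impl GraphDot<Array1<f32>> for Array1<f32> {
        type Output = f32;
        fn graph_dot(&self, rhs: &Array1<f32>) -> f32 {
            self.dot(rhs)
        }
    }

    In the graph setting, each impl would build its own node type instead of returning plain ndarray arrays.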

    enhancement help wanted later time 
    opened by frjnn 4
  • Conv2d: Dimension bug in Conv2d::forward

    Using the latest commit on main (ca97fbecbb4e1aec0fbad05494dad09bce3040af), the code shown below has unexpected behavior.

    It passes a 4-dimensional tensor through several Conv2d layers, but the output dimensions differ from what is expected; the last dimension (height) of the tensor seems to be capped by the 2nd dimension (number of channels) of the tensor.

    The program then panics when actually trying to forward through the node graph, at line 83 of the example below: thread 'main' panicked at 'ndarray: could not broadcast array from shape: [5] to: [10, 5, 8, 8]'

    The code, run with rust = 1.57.0 and ndarray = 0.15.4:

    use ndarray::{Ix4};
    use neuronika::{Backward, Data, Forward, Gradient, Overwrite, VarDiff, Convolve, zeros};
    use neuronika::nn::{ModelStatus, Conv2d, Learnable};
    use neuronika::nn;
    
    // Network definition.
    struct NeuralNetwork {
    	con1: Conv2d<nn::Zero>,
    	con2: Conv2d<nn::Zero>,
    	con3: Conv2d<nn::Zero>,
    	con4: Conv2d<nn::Zero>,
    	con5: Conv2d<nn::Zero>,
    	status: ModelStatus,     
    }
    
    impl NeuralNetwork {
    	fn new() -> Self {
    		let mut status = ModelStatus::default();
    
    		let con1 = status.register(
    			Conv2d::new(5, 5, (3, 3), (0, 0), nn::Zero, (1, 1), (1, 1))
    		);
    		let con2 = status.register(
    			Conv2d::new(5, 5, (3, 3), (0, 0), nn::Zero, (1, 1), (1, 1))
    		);
    		let con3 = status.register(
    			Conv2d::new(5, 5, (3, 3), (0, 0), nn::Zero, (1, 1), (1, 1))
    		);
    		let con4 = status.register(
    			Conv2d::new(5, 5, (3, 3), (0, 0), nn::Zero, (1, 1), (1, 1))
    		);
    		let con5 = status.register(
    			Conv2d::new(5, 4, (2, 2), (0, 0), nn::Zero, (1, 1), (1, 1))
    		);
    		
    		Self {
    			con1: con1,
    			con2: con2,
    			con3: con3,
    			con4: con4,
    			con5: con5,
    			status,
    		}
    	}
    
        fn forward<I, T, U>(
            &self,
            input: I,
        ) -> VarDiff<impl Data<Dim = Ix4> + Forward, impl Gradient<Dim = Ix4> + Overwrite + Backward>
        where
    		I: Convolve<I, Learnable<Ix4>, nn::Zero>,
    		I::Output: Into<VarDiff<T, U>>,
    		T: Data<Dim = Ix4>,
    		U: Gradient<Dim = Ix4> + Overwrite,
        {
    		let out1 = self.con1.forward(input).relu();
    		println!("{:?}", out1.data().dim()); // expected (10, 5, 8, 8), got (10, 5, 8, 8)
    		let out2 = self.con2.forward(out1).relu();
    		println!("{:?}", out2.data().dim()); // expected (10, 5, 6, 6), got (10, 5, 6, 6)
    		let out3 = self.con3.forward(out2).relu();
    		println!("{:?}", out3.data().dim()); // expected (10, 5, 4, 4), got (10, 5, 6, 5)
    		let out4 = self.con4.forward(out3).relu();
    		println!("{:?}", out4.data().dim()); // expected (10, 5, 2, 2), got (10, 5, 2, 5)
    		let out5 = self.con5.forward(out4).relu();
    		println!("{:?}", out5.data().dim()); // expected (10, 5, 1, 1), got (10, 5, 1, 4)
    		
    		out5
        }
    }
    
    fn main() {
    	let batch_size = 10;
    	let channels = 5;
    	let width = 10;
    	let height = 10;
        
    	let data = zeros((batch_size, channels, width, height));
    	
    	let net = NeuralNetwork::new();
    	let result = net.forward(data);
    	result.forward();
    }
    
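    For reference, the expected spatial sizes in the comments above follow the standard convolution output formula, sketched here independently of neuronika:

    /// Output length along one spatial axis of a convolution
    /// (standard formula, used only to derive the expected shapes above).
    fn conv_out_len(input: usize, kernel: usize, padding: usize, stride: usize, dilation: usize) -> usize {
        (input + 2 * padding - dilation * (kernel - 1) - 1) / stride + 1
    }

    fn main() {
        // A 10x10 input through kernels 3, 3, 3, 3, 2 (no padding, stride 1).
        let mut len = 10;
        for &kernel in [3usize, 3, 3, 3, 2].iter() {
            len = conv_out_len(len, kernel, 0, 1, 1);
            print!("{} ", len); // 8 6 4 2 1
        }
        println!();
    }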
    
    opened by lodykas 3
  • Example / quickstart fails to compile

    rustup 1.24.3 (ce5817a94 2021-05-31)
    rustc 1.58.1 (db9d1b20b 2022-01-20)
    cargo 1.58.0 (f01b232bc 2022-01-19)
    

    Two issues: first, cargo build --examples doesn't work because the label in Cargo.toml is [[example]] instead of [[examples]] as it should be. Fixing that manually leads to the following errors:

    $ cargo build --examples
    warning: unused manifest key: examples
       Compiling neuronika v0.2.0 (/home/john/umbc/git/neuronika)
    warning: unused import: `self`
     --> examples/quickstart.rs:4:10
      |
    4 |     nn::{self, loss, Learnable, Linear, ModelStatus},
      |          ^^^^
      |
      = note: `#[warn(unused_imports)]` on by default
    
    error[E0277]: the trait bound `NeuralNetwork: serde::de::Deserialize<'_>` is not satisfied
        --> examples/quickstart.rs:56:9
         |
    56   |     } = serde_json::from_str(
         |         ^^^^^^^^^^^^^^^^^^^^ the trait `serde::de::Deserialize<'_>` is not implemented for `NeuralNetwork`
         |
    note: required by a bound in `serde_json::from_str`
        --> /home/john/.cargo/registry/src/github.com-1ecc6299db9ec823/serde_json-1.0.78/src/de.rs:2611:8
         |
    2611 |     T: de::Deserialize<'a>,
         |        ^^^^^^^^^^^^^^^^^^^ required by this bound in `serde_json::from_str`
    
    For more information about this error, try `rustc --explain E0277`.
    warning: `neuronika` (example "quickstart") generated 1 warning
    error: could not compile `neuronika` due to previous error; 1 warning emitted
    

    I'm super excited for this library and its potential to build portable, cross-platform binaries that can train RNNs, CNNs, and feed-forward networks.

    opened by clemej 2
  • A few questions

    Hi folks,

    First of all, I would like to say: impressive work so far! I am a person with an engineering and research background (ML/DL). I have been using PyTorch and TensorFlow for a while now, and I must say that the architecture is similar and the transition from PyTorch to neuronika feels smooth, which is amazing.

    I am a bit familiar with the Julia ecosystem too, especially Flux, Zygote and JuliaGPU. This ecosystem has a few points that might be interesting to discuss:

    1. Abstraction over the autodiff machinery
    2. Abstraction over devices
    3. (Future abstraction over operations/devices using KernelAbstractions)

    What do you folks think about having a small abstraction over the choice of autodiff (separating it from the framework itself)? How would Rust play along with GPUs (Nvidia, AMD)? For instance, JuliaGPU has a compiler and a GC to deal with the allocation of memory on the device. Would it be easy to compile Rust-y kernels directly to the GPU?

    Best wishes,

    opened by darleybarreto 2
  • Model selection

    We've finally reached a good point where we can take the first steps toward supporting model selection.

    Cookbook

    The following is a list of things that we aim to provide:

    • [x] Cross Validation
      • [x] KFold
      • [x] StratifiedKFold

    Please change the previous list as needed.

    opened by ste-pac 2
  • rust stable 1.58 cannot compile the projects

    "C:/Program Files (x86)/Rust stable GNU 1.58/bin/cargo.exe" build --color=always --message-format=json-diagnostic-rendered-ansi --all --all-targets error: failed to load manifest for workspace member G:\Codebase\Rust\neuronika-main\neuronika-variable Caused by: failed to parse manifest at G:\Codebase\Rust\neuronika-main\neuronika-variable\Cargo.toml Caused by: namespaced features with the dep: prefix are only allowed on the nightly channel and requires the -Z namespaced-features flag on the command-line Process finished with exit code 101

    opened by miaomiao1992 1
  • Hyperlinks are outdated in Codebase structure section of CONTRIBUTING.md file

    opened by akhildevelops 2
  • quickstart.rs example is outdated

    There are compilation errors when running the quickstart.rs example.

    error[E0432]: unresolved imports `neuronika::nn::loss`, `neuronika::nn::Learnable`, `neuronika::nn::ModelStatus`, `neuronika::Data`, `neuronika::Gradient`, `neuronika::MatMatMulT`, `neuronika::Param`
     --> examples/quickstart.rs:4:10
      |
    4 |     nn::{loss, Learnable, Linear, ModelStatus},
      |          ^^^^  ^^^^^^^^^          ^^^^^^^^^^^ no `ModelStatus` in `nn`
      |          |     |
      |          |     no `Learnable` in `nn`
      |          no `loss` in `nn`
    5 |     optim, Data, Gradient, MatMatMulT, Param, VarDiff,
      |            ^^^^  ^^^^^^^^  ^^^^^^^^^^  ^^^^^ no `Param` in the root
      |            |     |         |
      |            |     |         no `MatMatMulT` in the root
      |            |     no `Gradient` in the root
      |            no `Data` in the root
      |            help: a similar name exists in the module: `data`
    
    error[E0432]: unresolved import `ndarray`
     --> examples/quickstart.rs:1:5
      |
    1 | use ndarray::Ix2;
      |     ^^^^^^^ use of undeclared crate or module `ndarray`
    
    error[E0433]: failed to resolve: use of undeclared crate or module `serde`
     --> examples/quickstart.rs:8:42
      |
    8 | #[cfg_attr(feature = "serialize", derive(serde::Serialize, serde::Deserialize))]
      |                                          ^^^^^ use of undeclared crate or module `serde`
    
    error[E0433]: failed to resolve: use of undeclared crate or module `serde`
     --> examples/quickstart.rs:8:60
      |
    8 | #[cfg_attr(feature = "serialize", derive(serde::Serialize, serde::Deserialize))]
      |                                                            ^^^^^ use of undeclared crate or module `serde`
    
    error: cannot find attribute `serde` in this scope
      --> examples/quickstart.rs:13:39
       |
    13 |     #[cfg_attr(feature = "serialize", serde(skip))]
       |                                       ^^^^^
    
    error[E0433]: failed to resolve: could not find `SGD` in `optim`
       --> examples/quickstart.rs:209:28
        |
    209 |     let optimizer = optim::SGD::new(model.parameters(), 0.01, optim::L2::new(0.0));
        |                            ^^^ could not find `SGD` in `optim`
    
    Some errors have detailed explanations: E0432, E0433.
    For more information about an error, try `rustc --explain E0432`.
    error: could not compile `neuronika` due to 6 previous errors
    
    opened by akhildevelops 0
  • Forward compat change to prep for breakage from a soundness fix

    In https://github.com/rust-lang/rust/pull/95474#issuecomment-1228705007 we are attempting to deploy a soundness bug fix for rare cases of impl Trait causing UB. While your crate does not use this unsoundly, it requires an additional annotation in order for rustc to be able to figure out that your case is actually sound.

    This change was done by following the diagnostics emitted by the rustc that contains the bugfix.

    opened by oli-obk 3
  • Does it support a Python programming environment?

    Will it be published on PyPI for Python users in the future? Many people currently study deep learning in a Python programming environment.

    opened by linjing-lab 1
  • Unable to compile example in stand-alone repository using main due to Serde error

    Hello,

    I tried building the quickstart example in a stand-alone project today, and was unable to do so. I've tested that cloning neuronika off the main branch builds (and the example runs), but when pulling it in as a dependency, I am consistently running into the error

    error[E0277]: the trait bound `NeuralNetwork: serde::de::Deserialize<'_>` is not satisfied
        --> src/main.rs:53:9
         |
    53   |     } = serde_json::from_str(
         |         ^^^^^^^^^^^^^^^^^^^^ the trait `serde::de::Deserialize<'_>` is not implemented for `NeuralNetwork`
         |
    note: required by a bound in `serde_json::from_str`
        --> /home/chrism/.cargo/registry/src/github.com-1ecc6299db9ec823/serde_json-1.0.79/src/de.rs:2611:8
         |
    2611 |     T: de::Deserialize<'a>,
         |        ^^^^^^^^^^^^^^^^^^^ required by this bound in `serde_json::from_str`
    
    For more information about this error, try `rustc --explain E0277`.
    error: could not compile `neuronika_example` due to previous error
    

    Running this code with 0.2 doesn't seem feasible due to what I'm assuming are updated trait bounds around Overwrite.

    My Cargo.toml file has the following configuration, based on neuronika main's:

    [dependencies]
    neuronika = {git = "https://github.com/neuronika/neuronika.git", branch = "main", features = ["serialize"]}
    ndarray = {version = "0.15.4", features = ["rayon"]}
    serde_json = "1.0.72"
    serde = {version = "1.0.130", features = ["derive"]}
    

    I am compiling with rustc 1.60.0-stable on an x86 platform using Pop!_OS 21.04.

    I'm very interested in this library, and would love to experiment with starting to solve basic classification problem like MNIST or CIFAR-10 with it. Any advice on how I might be able to resolve these issues would be greatly appreciated. Thanks for your time!

    opened by quietlychris 14