Scientific Computing Library in Rust

In Digits

Last update: Dec 16, 2022


Overview

SciRust

Scientific computing library written in the Rust programming language.

The objective is to design a generic library which can be used as a backbone for scientific computing.

Current emphasis is less on performance and more on providing a comprehensive API.

Current areas of focus

Fundamental algebraic structures
Matrices
Linear algebra
Statistics
Signal processing

A discussion group is set up at SciRust Google Group.

Features

General

Pure Rust implementation
Focus on generic programming
Extensive unit tests for all features
Column major implementation
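
As a rough illustration of what column-major storage means for indexing, the sketch below computes the offset of element (r, c) in a flat buffer. The helper names `col_major_offset` and `row_of` are hypothetical and are not part of the SciRust API.

```rust
/// Offset of element (r, c) in a column-major buffer with `rows` rows.
/// Elements of one column sit next to each other, so walking down a
/// column moves through memory with stride 1.
pub fn col_major_offset(rows: usize, r: usize, c: usize) -> usize {
    c * rows + r
}

/// Collect row `r` of a `rows x cols` column-major buffer.
/// Reading a row jumps `rows` elements between consecutive entries.
pub fn row_of(data: &[f64], rows: usize, cols: usize, r: usize) -> Vec<f64> {
    assert_eq!(data.len(), rows * cols, "buffer size must match the shape");
    (0..cols)
        .map(|c| data[col_major_offset(rows, r, c)])
        .collect()
}
```

For a 2x3 matrix stored as `[1, 2, 3, 4, 5, 6]`, the columns are `[1, 2]`, `[3, 4]` and `[5, 6]`, so row 0 is `[1, 3, 5]`.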

Matrices

Generic matrix class supporting various data-types (u8, i8, u16, i16, ... , f32, f64, Complex32, Complex64)
Views over parts of matrices
Comprehensive support for operations on matrices.
Views over sub-matrices with similar operations.
Special support for triangular matrices.

Linear algebra

Solving systems of linear equations
LDU factorization
Rank, Determinant, Inverse

About Rust and Building the project

If you are unfamiliar with Rust, you are recommended to go through The Rust Programming Language Book.

The library can be built and used using Cargo which is the official dependency management and build tool for Rust.

Working with matrices requires a lot of low-level code. As a user of the library, you should not have to write that low-level code yourself. If you are reading or debugging the source code of the library, however, you will see a lot of it. A good knowledge of Rust will help you sail through it. Check out topics like:

The library code is full of unit tests. These unit tests serve multiple purposes:

Making sure that the functions work as advertised.
Extensively testing those functions which use unsafe and low level features of Rust.
Learning about how to use the library features.

If you haven't already, please familiarize yourself with Unit Testing in Rust. Writing unit tests will help you write better and more reliable code.

Comments

[MRG?] Access matrices through iterators where possible: the case of the major diagonal.
Not ready for merge. Started from changing implementation of trace to use uniform access to the diagonal elements. Current plan is:

Separate the access loop into its own function.

Put the function interface into a trait function. The "Shape" trait seems like the right place.

Move trace functionality into a trait... maybe NumberMatrix?

If this sounds ok, I'll do those, then generalize to other functionality.
opened by daniel-vainsencher 18
updating to build with newest rust

I'm new to rust but not scientific computing and I wanted to use this library. It wouldn't build so I forked it and went through many of the build errors and warnings trying to shut them up. It still does not compile but it is much further along than when I started. The first commit should be good but the second one had a bunch of find and replace stuff (in a directory) which usually gets messy.

opened by waynenilsen 10
Block multiplication of matrices.

I've written a function that multiplies matrices by blocks. It should be faster on larger matrices because the simple multiplication function makes poor use of the cache. It doesn't change the memory layout of the matrix.

I've tested the performance of both the simple (previous) and block multiplication of square matrices. For block size of 16 (it performed best on my computer) the block multiplication is outperforming the simple multiplication when matrix size is 32x32 and when the matrix size is 256x256 it's twice as fast as simple multiplication. I've copied the bench output at the end of this post.
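
The idea can be sketched as follows. This is not the code from the pull request: it is a minimal illustration that assumes square `n x n` matrices of `f64` stored column-major in flat slices, and the function names `multiply_simple` and `multiply_blocked` are made up for the example.

```rust
/// Simple product C = A * B for n x n column-major matrices.
/// Element (i, j) lives at offset j * n + i.
pub fn multiply_simple(a: &[f64], b: &[f64], n: usize) -> Vec<f64> {
    let mut c = vec![0.0; n * n];
    for j in 0..n {
        for k in 0..n {
            let bkj = b[j * n + k];
            for i in 0..n {
                c[j * n + i] += a[k * n + i] * bkj;
            }
        }
    }
    c
}

/// Blocked product: the same arithmetic, but the loops are tiled so that
/// each block x block tile of A, B and C is reused while it is still in cache.
/// The memory layout of the operands is left unchanged.
pub fn multiply_blocked(a: &[f64], b: &[f64], n: usize, block: usize) -> Vec<f64> {
    assert!(block > 0, "block size must be positive");
    let mut c = vec![0.0; n * n];
    for jj in (0..n).step_by(block) {
        let j_end = (jj + block).min(n);
        for kk in (0..n).step_by(block) {
            let k_end = (kk + block).min(n);
            for ii in (0..n).step_by(block) {
                let i_end = (ii + block).min(n);
                // Multiply one tile of A by one tile of B into a tile of C.
                for j in jj..j_end {
                    for k in kk..k_end {
                        let bkj = b[j * n + k];
                        for i in ii..i_end {
                            c[j * n + i] += a[k * n + i] * bkj;
                        }
                    }
                }
            }
        }
    }
    c
}
```

The best block size depends on the machine, which is why it is a parameter here. Very small matrices pay the overhead of the extra loops without gaining anything from tiling.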

PS: This is my first pull request, so I doubt it's well-formed. I wasn't sure if I had to raise an issue before making a pull request. Also, I'm not sure if the code is readable enough.

This is the bench output on my Intel i7-3740QM PC.

test matrix::transpose::matrix_transpose::bench::bench_gram_optimized ... bench: 3,174,322 ns/iter (+/- 169,171)
test matrix::transpose::matrix_transpose::bench::bench_gram_simple ... bench: 6,123,006 ns/iter (+/- 241,711)
test matrix::transpose::matrix_transpose::bench::bench_multiply_0001_block ... bench: 273 ns/iter (+/- 78)
test matrix::transpose::matrix_transpose::bench::bench_multiply_0001_simple ... bench: 32 ns/iter (+/- 2)
test matrix::transpose::matrix_transpose::bench::bench_multiply_0002_block ... bench: 292 ns/iter (+/- 6)
test matrix::transpose::matrix_transpose::bench::bench_multiply_0002_simple ... bench: 45 ns/iter (+/- 7)
test matrix::transpose::matrix_transpose::bench::bench_multiply_0004_block ... bench: 378 ns/iter (+/- 11)
test matrix::transpose::matrix_transpose::bench::bench_multiply_0004_simple ... bench: 99 ns/iter (+/- 14)
test matrix::transpose::matrix_transpose::bench::bench_multiply_0008_block ... bench: 793 ns/iter (+/- 160)
test matrix::transpose::matrix_transpose::bench::bench_multiply_0008_simple ... bench: 444 ns/iter (+/- 64)
test matrix::transpose::matrix_transpose::bench::bench_multiply_0016_block ... bench: 3,378 ns/iter (+/- 274)
test matrix::transpose::matrix_transpose::bench::bench_multiply_0016_simple ... bench: 2,602 ns/iter (+/- 174)
test matrix::transpose::matrix_transpose::bench::bench_multiply_0032_block ... bench: 25,345 ns/iter (+/- 1,643)
test matrix::transpose::matrix_transpose::bench::bench_multiply_0032_simple ... bench: 28,967 ns/iter (+/- 813)
test matrix::transpose::matrix_transpose::bench::bench_multiply_0064_block ... bench: 192,730 ns/iter (+/- 23,921)
test matrix::transpose::matrix_transpose::bench::bench_multiply_0064_simple ... bench: 242,750 ns/iter (+/- 4,406)
test matrix::transpose::matrix_transpose::bench::bench_multiply_0128_block ... bench: 1,559,674 ns/iter (+/- 92,525)
test matrix::transpose::matrix_transpose::bench::bench_multiply_0128_simple ... bench: 2,045,354 ns/iter (+/- 245,310)
test matrix::transpose::matrix_transpose::bench::bench_multiply_0256_block ... bench: 12,406,499 ns/iter (+/- 498,650)
test matrix::transpose::matrix_transpose::bench::bench_multiply_0256_simple ... bench: 27,055,370 ns/iter (+/- 753,259)
test matrix::transpose::matrix_transpose::bench::bench_multiply_0512_block ... bench: 101,763,380 ns/iter (+/- 10,103,632)
test matrix::transpose::matrix_transpose::bench::bench_multiply_0512_simple ... bench: 227,811,582 ns/iter (+/- 21,193,979)

opened by dbalchev 4

Cannot compile scirust on Linux x64 (tried both stable and nightly)

OS: Archlinux x64
rustc: 1.16.0 (stable & nightly)
scirust: 0.0.4 & 0.0.5

error message

error[E0432]: unresolved import `std::rt::heap::allocate`
  --> /home/boss/.cargo/registry/src/github.com-1ecc6299db9ec823/scirust-0.0.5/src/matrix/matrix.rs:12:21
   |
12 | use std::rt::heap::{allocate, deallocate};
   |                     ^^^^^^^^ Could not find `heap` in `rt`

error[E0432]: unresolved import `std::rt::heap::deallocate`
  --> /home/boss/.cargo/registry/src/github.com-1ecc6299db9ec823/scirust-0.0.5/src/matrix/matrix.rs:12:31
   |
12 | use std::rt::heap::{allocate, deallocate};
   |                               ^^^^^^^^^^ Could not find `heap` in `rt`

error[E0432]: unresolved import `std::raw::Slice`
  --> /home/boss/.cargo/registry/src/github.com-1ecc6299db9ec823/scirust-0.0.5/src/matrix/matrix.rs:14:5
   |
14 | use std::raw::Slice;
   |     ^^^^^^^^^^^^^^^ no `Slice` in `raw`

error[E0432]: unresolved import `std::rt::heap::allocate`
 --> /home/boss/.cargo/registry/src/github.com-1ecc6299db9ec823/scirust-0.0.5/src/matrix/triangular_matrix.rs:9:5
  |
9 | use std::rt::heap::allocate;
  |     ^^^^^^^^^^^^^^^^^^^^^^^ Could not find `heap` in `rt`

error[E0432]: unresolved import `std::rt::heap::deallocate`
 --> /home/boss/.cargo/registry/src/github.com-1ecc6299db9ec823/scirust-0.0.5/src/util/memory.rs:4:21
  |
4 | use std::rt::heap::{deallocate};
  |                     ^^^^^^^^^^ Could not find `heap` in `rt`

error[E0554]: #[feature] may not be used on the stable release channel
 --> /home/boss/.cargo/registry/src/github.com-1ecc6299db9ec823/scirust-0.0.5/src/lib.rs:2:1
  |
2 | #![feature(raw)]
  | ^^^^^^^^^^^^^^^^

error[E0554]: #[feature] may not be used on the stable release channel
 --> /home/boss/.cargo/registry/src/github.com-1ecc6299db9ec823/scirust-0.0.5/src/lib.rs:3:1
  |
3 | #![feature(heap_api)]
  | ^^^^^^^^^^^^^^^^^^^^^

error[E0554]: #[feature] may not be used on the stable release channel
 --> /home/boss/.cargo/registry/src/github.com-1ecc6299db9ec823/scirust-0.0.5/src/lib.rs:4:1
  |
4 | #![feature(ptr_as_ref)]
  | ^^^^^^^^^^^^^^^^^^^^^^^

error[E0554]: #[feature] may not be used on the stable release channel
  --> /home/boss/.cargo/registry/src/github.com-1ecc6299db9ec823/scirust-0.0.5/src/lib.rs:31:2
   |
31 |  #![feature(core)]
   |  ^^^^^^^^^^^^^^^^^

error[E0554]: #[feature] may not be used on the stable release channel
  --> /home/boss/.cargo/registry/src/github.com-1ecc6299db9ec823/scirust-0.0.5/src/lib.rs:32:1
   |
32 | #![feature(alloc)]
   | ^^^^^^^^^^^^^^^^^^

error[E0554]: #[feature] may not be used on the stable release channel
  --> /home/boss/.cargo/registry/src/github.com-1ecc6299db9ec823/scirust-0.0.5/src/lib.rs:33:1
   |
33 | #![feature(step_by)]
   | ^^^^^^^^^^^^^^^^^^^^

error[E0554]: #[feature] may not be used on the stable release channel
  --> /home/boss/.cargo/registry/src/github.com-1ecc6299db9ec823/scirust-0.0.5/src/lib.rs:34:1
   |
34 | #![feature(convert)]
   | ^^^^^^^^^^^^^^^^^^^^

error[E0554]: #[feature] may not be used on the stable release channel
  --> /home/boss/.cargo/registry/src/github.com-1ecc6299db9ec823/scirust-0.0.5/src/lib.rs:35:1
   |
35 | #![feature(test)]
   | ^^^^^^^^^^^^^^^^^

error[E0554]: #[feature] may not be used on the stable release channel
  --> /home/boss/.cargo/registry/src/github.com-1ecc6299db9ec823/scirust-0.0.5/src/lib.rs:36:1
   |
36 | #![feature(associated_type_defaults)]
   | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

error: aborting due to 9 previous errors

error: Could not compile `scirust`.

UPD: build fails with the package version, published on crates.io; git version compiles successfully

opened by zkbpkp 1

strange line in README

NOTE: The library currently doesn't build against the stable release of Rust. It builds against the nightly release of Rust. This scenario is likely to stay till Q3 2015.

Today is 2017, so it looks like (at least) this line must be corrected?

opened by magicgoose 1
Fix compilation on the latest nightly.
For reference:

$ rustc --version
rustc 1.5.0-nightly (cff041170 2015-09-17)

Fixes compilation errors caused by the removal of std::rt::heap.
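
For readers hitting the same errors today: this is not what the pull request changed, but on current stable Rust the raw allocation functions live in `std::alloc`. The sketch below shows a minimal owned buffer built on `alloc` and `dealloc`. The `RawBuffer` type is hypothetical and exists only for illustration.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};

/// A hypothetical fixed-size buffer of f64 values that owns raw heap memory.
pub struct RawBuffer {
    ptr: *mut f64,
    len: usize,
}

impl RawBuffer {
    /// Allocate `len` elements and initialise them to zero.
    pub fn zeros(len: usize) -> RawBuffer {
        // Allocating a zero-sized layout is undefined behaviour, so reject it.
        assert!(len > 0, "zero-sized buffers are not supported in this sketch");
        let layout = Layout::array::<f64>(len).expect("capacity overflow");
        let ptr = unsafe { alloc(layout) } as *mut f64;
        if ptr.is_null() {
            handle_alloc_error(layout);
        }
        for i in 0..len {
            // Write through the raw pointer; the memory is uninitialised until now.
            unsafe { ptr.add(i).write(0.0) };
        }
        RawBuffer { ptr, len }
    }

    pub fn as_slice(&self) -> &[f64] {
        unsafe { std::slice::from_raw_parts(self.ptr, self.len) }
    }

    pub fn as_mut_slice(&mut self) -> &mut [f64] {
        unsafe { std::slice::from_raw_parts_mut(self.ptr, self.len) }
    }
}

impl Drop for RawBuffer {
    fn drop(&mut self) {
        // The layout passed to dealloc must match the one used for alloc.
        let layout = Layout::array::<f64>(self.len).expect("capacity overflow");
        unsafe { dealloc(self.ptr as *mut u8, layout) };
    }
}
```

Because `RawBuffer` implements `Drop`, it cannot also implement `Copy`, which is the same restriction the description below runs into for `Matrix`.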

Due to RFC 1214 this then spits out many warnings. Initially these were simple to fix (you just need to be more explicit with your trait bounds) but in order to fix all the warnings you need to implement Copy for Matrix which you obviously can't do since it implements Drop. For that reason I didn't fix any of the warnings, only the errors.
opened by Sean1708 1
Add a Gitter chat badge to README.md

indigits/scirust now has a Chat Room on Gitter

@shailesh1729 has just created a chat room. You can visit it here: https://gitter.im/indigits/scirust.

This pull-request adds this badge to your README.md:

If my aim is a little off, please let me know.

Happy chatting.

PS: Click here if you would prefer not to receive automatic pull-requests from Gitter in future.

opened by gitter-badger 1
Enable features ptr_as_ref, heap_api, raw, to allow build under nightly 1.3

Thanks to /u/annodomini [1] for explaining what is happening.

Daniel

https://www.reddit.com/r/rust/comments/3d3dim/cant_build_scirust_ci_thinks_it_is_fine/

opened by daniel-vainsencher 1
Relicense under dual MIT/Apache-2.0
This issue was automatically generated. Feel free to close without ceremony if you do not agree with re-licensing or if it is not possible for other reasons. Respond to @cmr with any questions or concerns, or pop over to #rust-offtopic on IRC to discuss.

You're receiving this because someone (perhaps the project maintainer) published a crates.io package with the license as "MIT" xor "Apache-2.0" and the repository field pointing here.

TL;DR the Rust ecosystem is largely Apache-2.0. Being available under that license is good for interoperation. The MIT license as an add-on can be nice for GPLv2 projects to use your code.

Why?

The MIT license requires reproducing countless copies of the same copyright header with different names in the copyright field, for every MIT library in use. The Apache license does not have this drawback. However, this is not the primary motivation for me creating these issues. The Apache license also has protections from patent trolls and an explicit contribution licensing clause. However, the Apache license is incompatible with GPLv2. This is why Rust is dual-licensed as MIT/Apache (the "primary" license being Apache, MIT only for GPLv2 compat), and doing so would be wise for this project. This also makes this crate suitable for inclusion and unrestricted sharing in the Rust standard distribution and other projects using dual MIT/Apache, such as my personal ulterior motive, the Robigalia project.

Some ask, "Does this really apply to binary redistributions? Does MIT really require reproducing the whole thing?" I'm not a lawyer, and I can't give legal advice, but some Google Android apps include open source attributions using this interpretation. Others also agree with it. But, again, the copyright notice redistribution is not the primary motivation for the dual-licensing. It's stronger protections to licensees and better interoperation with the wider Rust ecosystem.

How?

To do this, get explicit approval from each contributor of copyrightable work (as not all contributions qualify for copyright, due to not being a "creative work", e.g. a typo fix) and then add the following to your README:

## License

Licensed under either of

 * Apache License, Version 2.0 ([LICENSE-APACHE](LICENSE-APACHE) or http://www.apache.org/licenses/LICENSE-2.0)
 * MIT license ([LICENSE-MIT](LICENSE-MIT) or http://opensource.org/licenses/MIT)

at your option.

### Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

and in your license headers, if you have them, use the following boilerplate (based on that used in Rust):

// Copyright 2016 scirust developers
//
// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
// http://www.apache.org/licenses/LICENSE-2.0> or the MIT license
// <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
// option. This file may not be copied, modified, or distributed
// except according to those terms.

It's commonly asked whether license headers are required. I'm not comfortable making an official recommendation either way, but the Apache license recommends it in their appendix on how to use the license.

Be sure to add the relevant LICENSE-{MIT,APACHE} files. You can copy these from the Rust repo for a plain-text version.

And don't forget to update the license metadata in your Cargo.toml to:

license = "MIT/Apache-2.0"

I'll be going through projects which agree to be relicensed and have approval from the necessary contributors and making these changes, so feel free to leave the heavy lifting to me!

Contributor checkoff

To agree to relicensing, comment with :

I license past and future contributions under the dual MIT/Apache-2.0 license, allowing licensees to chose either at their option.

Or, if you're a contributor, you can check the box in this repo next to your name. My scripts will pick this exact phrase up and check your checkbox, but I'll come through and manually review this issue later as well.

[ ] @shailesh1729

[ ] @daniel-vainsencher

[ ] @achanda

[x] @dbalchev

[ ] @Sean1708
opened by emberian 3
RFC: a convention for error handling
This RFC proposes a convention for structuring methods in SciRust which can cater to the conflicting needs of efficiency, ease of use and effective error handling.

For the impatient:

// Efficient access without bound checks
unsafe fn get_unchecked(&self, r: usize, c: usize) -> T;
// Safe access with bound checks, raises error if invalid address
fn get_checked(&self, r: usize, c: usize) -> Result<T, Error>;
// User friendly version. Panics in case of error
fn get(&self, r: usize, c: usize) -> T;
// Efficient modification without bound checks
unsafe fn set_unchecked(&mut self, r: usize, c: usize, value: T);
// Safe modification with bound check
fn set(&mut self, r: usize, c: usize, value: T);

Detailed discussion

The audience of SciRust can be possibly divided into two usage scenarios.

A script style usage, where the objective is to quickly do some numerical experiment, get the results and analyze them.

A library development usage, where more professional libraries would be built on top of fundamental building blocks provided by SciRust (these may be other modules shipped in SciRust itself).

While the first usage scenario is important for getting new users hooked to the library, the second usage scenario is also important for justifying why Rust should be used for scientific software development compared to other scientific computing platforms.

In context of the two usage scenarios, the design of SciRust has three conflicting goals:

Ease of use

Efficiency

Well managed error handling

While ease of use is important for script style usage, efficiency and well managed error handling are important for serious software development on top of core components provided by SciRust.

We will consider the example of a get(r,c) method on a matrix object to discuss these conflicting goals. Please note that get is just a representative method for this discussion. The design ideas can be applied in many different parts of SciRust once accepted.

If get is being called in a loop, the code around it can usually ensure that every access stays within the boundary of the matrix. In that case, a bound check inside the implementation of get is just extra overhead.

While omitting the bound check is good for writing efficient software, it can lead to a number of memory-related bugs and goes against the fundamental philosophy of Rust (safety first). There are actually two different options for error handling:

Returning either Option<T> or Result<T, Error>.

Using the panic mechanism.

Option<T> or Result<T, Error> provides the users a fine grained control over what to do when an error occurs. This is certainly the Rusty way of doing things. At the same time, both of these return types make the user code more complicated. One has to add extra calls to .unwrap() even if one is sure that the function is not going to fail.

Users of scientific computing tend to prefer an environment where they can get more work done with less effort. This is one reason for the success of specialized environments like MATLAB. Open source environments like Python (NumPy, SciPy) try to achieve something similar.

While SciRust doesn't intend to compete at the level of simplicity provided by MATLAB/Python environments, it does intend to take an extra effort wherever possible to address the ease of use goal.
In this context, the return type of a getter should be just the value type T. This can be achieved safely by using a panic if the access boundary conditions are not met.

The discussion above suggests up to 3 possible ways of implementing methods like get.

An unchecked (and unsafe) version for high efficiency code where the calling code is responsible for ensuring that the necessary requirements for correct execution of the method are being met.

A safe version which returns either Option<T> or Result<T, Error> which can be used for professional software development where the calling code has full control over error handling.

Another safe version which panics in case of error but provides an API which is simpler to use for writing short scientific computing scripts.

Proposed convention

We propose that a method for which these variations need to be supported, should follow the convention defined below:

A method_unchecked version should provide basic implementation of the method. This should assume that necessary conditions for successful execution of the methods are already being ensured by the calling code. The unchecked version of method MUST be marked unsafe. This ensures that the calling code knows that it is responsible for ensuring the right conditions for calling the unchecked method.

A method_checked version should be implemented on top of a method_unchecked method. The checked version should check for all the requirements for calling the method safely. The return type should be either Option<T> or Result<T, Error>. In case the required conditions for calling the method are not met, a None or Error should be returned. Once the required conditions are met, method_unchecked should be called to get the result which would be wrapped inside Option or Result.

A method version should be built on top of method_checked version. It should simply attempt to unwrap the value returned by method_checked and return as T. If method_checked returns an error or None, this version should panic.

The first two versions are suitable for professional development, where most of the time we need a safe API but occasionally need an unsafe API for an efficient implementation. The third version is suitable for the script style usage scenario.

The convention has been illustrated in the three versions of get at the beginning of this document.
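
To make the layering concrete, here is a minimal sketch of the three getters and two setters on a toy column-major matrix. The `Matrix` struct, its `from_vec` constructor and the `Error` enum below are stand-ins written for this example; they are assumptions, not the actual SciRust types.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum Error {
    IndexOutOfBounds,
}

/// Toy column-major matrix used only to illustrate the convention.
pub struct Matrix<T> {
    rows: usize,
    cols: usize,
    data: Vec<T>,
}

impl<T: Copy> Matrix<T> {
    pub fn from_vec(rows: usize, cols: usize, data: Vec<T>) -> Matrix<T> {
        assert_eq!(data.len(), rows * cols, "data length must equal rows * cols");
        Matrix { rows, cols, data }
    }

    /// Unchecked version: the caller guarantees r < rows and c < cols.
    pub unsafe fn get_unchecked(&self, r: usize, c: usize) -> T {
        unsafe { *self.data.get_unchecked(c * self.rows + r) }
    }

    /// Checked version: validates the indices, then delegates to the unchecked one.
    pub fn get_checked(&self, r: usize, c: usize) -> Result<T, Error> {
        if r >= self.rows || c >= self.cols {
            return Err(Error::IndexOutOfBounds);
        }
        // Safe: the bounds were verified just above.
        Ok(unsafe { self.get_unchecked(r, c) })
    }

    /// Script-friendly version: unwraps the checked result and panics on error.
    pub fn get(&self, r: usize, c: usize) -> T {
        self.get_checked(r, c).expect("matrix index out of bounds")
    }

    /// Unchecked modification: the caller guarantees valid indices.
    pub unsafe fn set_unchecked(&mut self, r: usize, c: usize, value: T) {
        unsafe { *self.data.get_unchecked_mut(c * self.rows + r) = value };
    }

    /// Safe modification with a bound check; panics on invalid indices.
    pub fn set(&mut self, r: usize, c: usize, value: T) {
        assert!(r < self.rows && c < self.cols, "matrix index out of bounds");
        unsafe { self.set_unchecked(r, c, value) }
    }
}
```

Only the `_unchecked` methods contain the actual element access, so the bound check lives in exactly one place per operation and the panicking version adds nothing but an unwrap.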

API bloat

This convention is expected to lead to some API bloat. However, if it is followed consistently across the library, it should be easy to follow, both for users of the library and for its developers.
RFC
opened by shailesh1729 15
[WIP] Creating matrix from iterator, padding should be explicit.

Currently, just removes default padding. Plan for implementing padding versions is by extending the iterator with zeros, then calling the same underlying method. I propose for the default to require the iterator to end exactly with the matrix capacity (with a utility version that truncates the iter).
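
A rough sketch of that plan is shown below. It assumes a toy `Matrix` type and the hypothetical names `from_iter_exact`, `from_iter_padded`, `from_iter_truncated` and `FromIterError`, none of which are taken from the SciRust source. The strict constructor is the single underlying method; the padded and truncating variants only adapt the iterator before calling it.

```rust
use std::iter;

/// Hypothetical error for the strict constructor.
#[derive(Debug, PartialEq)]
pub enum FromIterError {
    TooFew { expected: usize, found: usize },
    TooMany { expected: usize },
}

/// Toy column-major matrix used only for this sketch.
#[derive(Debug, PartialEq)]
pub struct Matrix<T> {
    rows: usize,
    cols: usize,
    data: Vec<T>,
}

impl<T> Matrix<T> {
    /// Strict version: the iterator must yield exactly rows * cols items.
    /// No Zero bound is needed because no default value is ever inserted.
    pub fn from_iter_exact<I>(rows: usize, cols: usize, iter: I) -> Result<Matrix<T>, FromIterError>
    where
        I: IntoIterator<Item = T>,
    {
        let capacity = rows * cols;
        let mut iter = iter.into_iter();
        let data: Vec<T> = iter.by_ref().take(capacity).collect();
        if data.len() < capacity {
            return Err(FromIterError::TooFew { expected: capacity, found: data.len() });
        }
        if iter.next().is_some() {
            return Err(FromIterError::TooMany { expected: capacity });
        }
        Ok(Matrix { rows, cols, data })
    }

    /// Utility version that silently drops any items beyond the capacity.
    pub fn from_iter_truncated<I>(rows: usize, cols: usize, iter: I) -> Result<Matrix<T>, FromIterError>
    where
        I: IntoIterator<Item = T>,
    {
        Self::from_iter_exact(rows, cols, iter.into_iter().take(rows * cols))
    }

    pub fn shape(&self) -> (usize, usize) {
        (self.rows, self.cols)
    }

    pub fn as_slice(&self) -> &[T] {
        &self.data
    }
}

impl<T: Clone> Matrix<T> {
    /// Explicit padding: extend the iterator with `pad`, then reuse the strict path.
    /// In this sketch the `take` also truncates input that is too long.
    pub fn from_iter_padded<I>(rows: usize, cols: usize, iter: I, pad: T) -> Matrix<T>
    where
        I: IntoIterator<Item = T>,
    {
        let padded = iter.into_iter().chain(iter::repeat(pad)).take(rows * cols);
        Self::from_iter_exact(rows, cols, padded)
            .expect("a padded iterator yields exactly rows * cols items")
    }
}
```

With this split, running out of values early is an error unless the caller explicitly asks for padding and supplies the padding value.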

opened by daniel-vainsencher 0
We shouldn't implicitly insert default values?
Matrix::from_iter* etc all fill out the rest of the matrix with zeros if they run out of values early. I have several problems with this:

It masks bugs; I would prefer a panic whenever we run out of values early.

It requires the Zero trait even when we really should just be copying stuff around.

Relatively speaking, lots of code is spent on it.

I don't mind opening a PR if we agree on the change.
opened by daniel-vainsencher 4
Things I've had to look up starting to use SciRust
I've contributed a little code, so I'm not as fresh as a typical new user, but I've still had to hunt around for the following as I tried to convert some numerical code to use SciRust. I think our goal should be to be friendly towards someone coming to do numerical code from Python or R or Matlab; the intersection of numerical developers and Rust experts is pretty small.

Where is there a simple usage example? best I found is the tests in src/matrix/matrix.rs but the imports there look different.

extern crate scirust; // This is easy enough. But what else do I need? hmm, not recognizing Matrix.

use scirust::matrix::matrix::Matrix; // Ugh. Lots of repetition. The core datastructures should be easier to access than this. Maybe they are, but I didn't find the shortcut.

use scirust::matrix::traits::Shape; // Just to use num_rows, which is pretty basic. If I want to use matrices, I definitely want to know how many rows they have. This wasn't hard to find though, because rustc tells me.

Matrix::ones // I wanted to use this function, and it wasn't trivial. For example, adding a use statement for it ("use scirust::matrix::matrix::Matrix::ones;") gave me "error: ones is not directly importable [E0253]". Are there more friendly ways to expose this function? this is a very common function in my usage, and we should consider this as a UI issue.

use scirust::matrix::random::rand_std_normal; // Should have figured this one out faster by now, but I still doubt it needs to be this deep. Again, for me this is extremely common to want lots of Gaussians, almost like ones. Also, is there a logic to when we use "rand" vs. "random"?

cell_iter // Why is this not just plain iter? sounds reasonable to me that the elements of a matrix are the cells (though it is true that NumPy iterates over rows by default).
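
One possible way to shorten these imports is a prelude module that re-exports the core names, plus free functions that wrap associated functions such as `Matrix::ones` (associated functions cannot be imported with `use`, but free functions can). The sketch below only illustrates the pattern in a single file: the `constructors` and `prelude` modules are hypothetical, and `num_rows` is written as an inherent method here for brevity rather than coming from the `Shape` trait.

```rust
pub mod matrix {
    // Mirrors the `matrix::matrix::Matrix` path quoted above.
    pub mod matrix {
        /// Toy matrix standing in for the real type.
        #[derive(Debug, PartialEq)]
        pub struct Matrix {
            rows: usize,
            cols: usize,
            data: Vec<f64>,
        }

        impl Matrix {
            pub fn ones(rows: usize, cols: usize) -> Matrix {
                Matrix { rows, cols, data: vec![1.0; rows * cols] }
            }

            pub fn num_rows(&self) -> usize {
                self.rows
            }

            pub fn num_cols(&self) -> usize {
                self.cols
            }

            pub fn as_slice(&self) -> &[f64] {
                &self.data
            }
        }
    }

    /// Hypothetical module of free-function wrappers.
    pub mod constructors {
        use super::matrix::Matrix;

        /// Free function, so `use ...::ones;` is allowed.
        pub fn ones(rows: usize, cols: usize) -> Matrix {
            Matrix::ones(rows, cols)
        }
    }
}

/// Hypothetical prelude: one glob import brings in the common names.
pub mod prelude {
    pub use super::matrix::constructors::ones;
    pub use super::matrix::matrix::Matrix;
}
```

A user would then write a single `use` of the prelude and call `ones(2, 3)` directly, instead of spelling out the full module path for each item.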

I was also surprised that X[i] actually compiles. Is there no way to require two indices to access a matrix? Would it be better to forgo indexing syntax, just to avoid the potential for bugs? Or maybe implement Index for pair types?
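
The last option can be sketched with the standard `Index` and `IndexMut` traits. The `Matrix` type below is a toy column-major stand-in written for this example, not the SciRust type. Since only the `(usize, usize)` index is implemented, an expression like `x[i]` with a single `usize` would not compile for this type.

```rust
use std::ops::{Index, IndexMut};

/// Toy column-major matrix used only for this sketch.
pub struct Matrix<T> {
    rows: usize,
    cols: usize,
    data: Vec<T>,
}

impl<T> Matrix<T> {
    pub fn from_vec(rows: usize, cols: usize, data: Vec<T>) -> Matrix<T> {
        assert_eq!(data.len(), rows * cols, "data length must equal rows * cols");
        Matrix { rows, cols, data }
    }
}

impl<T> Index<(usize, usize)> for Matrix<T> {
    type Output = T;

    /// `m[(r, c)]` reads the element at row r, column c; panics if out of bounds.
    fn index(&self, (r, c): (usize, usize)) -> &T {
        assert!(r < self.rows && c < self.cols, "matrix index out of bounds");
        &self.data[c * self.rows + r]
    }
}

impl<T> IndexMut<(usize, usize)> for Matrix<T> {
    /// `m[(r, c)] = value` writes the element at row r, column c.
    fn index_mut(&mut self, (r, c): (usize, usize)) -> &mut T {
        assert!(r < self.rows && c < self.cols, "matrix index out of bounds");
        &mut self.data[c * self.rows + r]
    }
}
```

The explicit bound check matters here: without it, an out-of-range row index could still land inside the flat buffer and silently return an element from a different column.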
opened by daniel-vainsencher 9