A Rust library for random number generation.

Last update: Jan 6, 2023

Related tags

Utilities rand

Overview

Rand

A Rust library for random number generation, featuring:

Easy random value generation and usage via the Rng, SliceRandom and IteratorRandom traits
Secure seeding via the getrandom crate and fast, convenient generation via thread_rng
A modular design built over rand_core (see the book)
Fast implementations of the best-in-class cryptographic and non-cryptographic generators
A flexible distributions module
Samplers for a large number of random number distributions via our own rand_distr and via the statrs
Portably reproducible output
#[no_std] compatibility (partial)
Many performance optimisations

It's also worth pointing out what rand is not:

Small. Most low-level crates are small, but the higher-level rand and rand_distr each contain a lot of functionality.
Simple (implementation). We have a strong focus on correctness, speed and flexibility, but not simplicity. If you prefer a small-and-simple library, there are alternatives including fastrand and oorandom.
Slow. We take performance seriously, with considerations also for set-up time of new distributions, commonly-used parameters, and parameters of the current sampler.

Documentation:

Usage

Add this to your Cargo.toml:

[dependencies]
rand = "0.8.4"

To get started using Rand, see The Book.

Versions

Rand is mature (suitable for general usage, with infrequent breaking releases which minimise breakage) but not yet at 1.0. We maintain compatibility with pinned versions of the Rust compiler (see below).

Current Rand versions are:

Version 0.7 was released in June 2019, moving most non-uniform distributions to an external crate, moving from_entropy to SeedableRng, and many small changes and fixes.
Version 0.8 was released in December 2020 with many small changes.

A detailed changelog is available for releases.

When upgrading to the next minor series (especially 0.4 → 0.5), we recommend reading the Upgrade Guide.

Rand has not yet reached 1.0 implying some breaking changes may arrive in the future (SemVer allows each 0.x.0 release to include breaking changes), but is considered mature: breaking changes are minimised and breaking releases are infrequent.

Rand libs have inter-dependencies and make use of the semver trick in order to make traits compatible across crate versions. (This is especially important for RngCore and SeedableRng.) A few crate releases are thus compatibility shims, depending on the next lib version (e.g. rand_core versions 0.2.2 and 0.3.1). This means, for example, that rand_core_0_4_0::SeedableRng and rand_core_0_3_0::SeedableRng are distinct, incompatible traits, which can cause build errors. Usually, running cargo update is enough to fix any issues.

Yanked versions

Some versions of Rand crates have been yanked ("unreleased"). Where this occurs, the crate's CHANGELOG should be updated with a rationale, and a search on the issue tracker with the keyword yank should uncover the motivation.

Rust version requirements

Since version 0.8, Rand requires Rustc version 1.36 or greater. Rand 0.7 requires Rustc 1.32 or greater while versions 0.5 require Rustc 1.22 or greater, and 0.4 and 0.3 (since approx. June 2017) require Rustc version 1.15 or greater. Subsets of the Rand code may work with older Rust versions, but this is not supported.

Continuous Integration (CI) will always test the minimum supported Rustc version (the MSRV). The current policy is that this can be updated in any Rand release if required, but the change must be noted in the changelog.

Crate Features

Rand is built with these features enabled by default:

std enables functionality dependent on the std lib
alloc (implied by std) enables functionality requiring an allocator
getrandom (implied by std) is an optional dependency providing the code behind rngs::OsRng
std_rng enables inclusion of StdRng, thread_rng and random (the latter two also require that std be enabled)

Optionally, the following dependencies can be enabled:

log enables logging via the log crate

Additionally, these features configure Rand:

small_rng enables inclusion of the SmallRng PRNG
nightly enables some optimizations requiring nightly Rust
simd_support (experimental) enables sampling of SIMD values (uniformly random SIMD integers and floats), requiring nightly Rust
min_const_gen enables generating random arrays of any size using min-const-generics, requiring Rust ≥ 1.51.

Note that nightly features are not stable and therefore not all library and compiler versions will be compatible. This is especially true of Rand's experimental simd_support feature.

Rand supports limited functionality in no_std mode (enabled via default-features = false). In this case, OsRng and from_entropy are unavailable (unless getrandom is enabled), large parts of seq are unavailable (unless alloc is enabled), and thread_rng and random are unavailable.

WASM support

The WASM target wasm32-unknown-unknown is not automatically supported by rand or getrandom. To solve this, either use a different target such as wasm32-wasi or add a direct dependency on getrandom with the js feature (if the target supports JavaScript). See getrandom#WebAssembly support.

License

Rand is distributed under the terms of both the MIT license and the Apache License (Version 2.0).

See LICENSE-APACHE and LICENSE-MIT, and COPYRIGHT for details.

Comments

Crate refactor
This is a rough preview of what these RFCs may result in:

Rand crate revision #2106

New rand-core crate #2152

Issue tracker

Current state in my repo

rustdoc for refactor

This is a major break-anything-I-like refactor of rand, in an attempt to fix a bunch of issues I and others have found. I don't expect it to be merged as-is, but hopefully a view of what rand could be will help answer questions about how it should be.

There are a few bits unfinished (see TODO and FIXME); also some changes from other PRs should be integrated. Possibly there are a couple of things which could be cleaned up further. But mostly this is ready for review now.

Comments welcome. Each commit should be fairly clean with a low number of changes, but put together there are a lot of changes.
opened by dhardy 89
Investigate SIMD support

Disclaimer: I don't really know what I am talking about here.

The first step, as far as Rand is concerned, is to generate a small array of values from an RNG. Some RNG's may lend itself well for this, like ChaCha, and possibly HC-128, although that one does not use SIMD. Other options are packing together multiple small RNGs, like Xorshift+ or Xoroshiro+. The result type of such an RNG should be some array, so maybe we can make use of the BlockRngCore trait?

Making use of the generated values can be a more interesting problem. This involves the distributions code.

Converting to floating point: the Standard uniform distribution should not cause any problems, and neither should the range code.

I suppose the other distributions are going to give trouble, at least anything that needs branches or a look-up table (i.e. the current ziggurat code) is not going to work well. And what should happen when an algorithm needs to do rejections sampling, and one of a couple of SIMD values get rejected? Return NaN? For most distributions there should exist an algorithm suitable for SIMD. The Box-Muller transform for example looks promising for the Normal distribution, although it would need a library to provide functions like sin and cos.

Integer ranges: I don't expect the widening multiply method to work for integer range reduction, and division or modulus also do not seem to be available. And I suppose we have no way to indicate a value is rejected because it falls outside the acceptable zone and can cause bias.

Now I don't know how far we should go, and if Rand should provide any implementations at all. But I think we should make sure the API does not prevent Rand from being used in this area.

opened by pitdicker 69
ThreadRng / EntropyRng improvements
Remove usage of Rc from ThreadRng (#463 and ??) — I think we only didn't do this before because of safety concerns, but (1) use of raw pointers automatically implies the type is neither send nor sync, (2) thread_local instantiates automatically so there is always a valid object and (3) none of the types involved implement Drop

Clean up EntropyRng a little (working towards #313)

Add set_custom_entropy function

~~Next step: allow custom entropy source. Unfortunately the EntropySource trait is still not good enough for run-time injection because the size & alignment of instances is unknown (it would be nice if this could be enforced somehow).~~
E-question
opened by dhardy 61

seq: use Floyd's combination algorithm to sample indices

Adapt #144 for sampling indices

This isn't the fully generic implementation that @burdges wrote; we could make this more generic. I'm guessing we could include an arbitrary lower bound without performance overhead (since the compiler can probably eliminate a constant of 0), and could also use generic typing.

I'm not really sure if it should be committed like this. It cleans up the code and improves performance in some cases, but isn't optimal in others. We could keep the old sample_indices_inplace and just replace sample_indices_cache.

# before
test gen_1k_sample_iter              ... bench:         390 ns/iter (+/- 4) = 2625 MB/s                                                
test misc_sample_indices_100_of_1k   ... bench:         806 ns/iter (+/- 8)                                                            
test misc_sample_indices_10_of_1k    ... bench:         594 ns/iter (+/- 9)                                                            
test misc_sample_indices_50_of_1k    ... bench:         472 ns/iter (+/- 8)                                                            
test misc_sample_iter_10_of_100      ... bench:       1,072 ns/iter (+/- 15)                                                           
test misc_sample_slice_10_of_100     ... bench:         160 ns/iter (+/- 2)                                                            
test misc_sample_slice_ref_10_of_100 ... bench:         160 ns/iter (+/- 2)                                                            
# after                                                                                                       
test gen_1k_sample_iter              ... bench:         391 ns/iter (+/- 27) = 2618 MB/s
test misc_sample_indices_100_of_1k   ... bench:       2,031 ns/iter (+/- 39)
test misc_sample_indices_10_of_1k    ... bench:          90 ns/iter (+/- 1)
test misc_sample_indices_50_of_1k    ... bench:         649 ns/iter (+/- 11)
test misc_sample_iter_10_of_100      ... bench:       1,083 ns/iter (+/- 16)
test misc_sample_slice_10_of_100     ... bench:         157 ns/iter (+/- 1)
test misc_sample_slice_ref_10_of_100 ... bench:         158 ns/iter (+/- 3)

Edit: now:

test misc_sample_indices_100_of_1k           ... bench:         515 ns/iter (+/- 24)
test misc_sample_indices_10_of_1k            ... bench:          86 ns/iter (+/- 3)
test seq_iter_choose_multiple_10_of_100      ... bench:       1,137 ns/iter (+/- 319)
test seq_iter_choose_multiple_fill_10_of_100 ... bench:       1,088 ns/iter (+/- 35)
test seq_slice_choose_multiple_10_of_100     ... bench:         139 ns/iter (+/- 8)

B-API T-sequences D-review

opened by dhardy 58

Tracker: planned changes for 0.5
To follow up on the release strategy in #205, here is a rough plan for Rand 0.5:

[x] Merge PRNG code improvements #209

[x] Add Hc128Rng #210

[x] Add new error types #225

[x] Add SeedableRng::from_rng to replace Rand impls on PRNGs; update seed type #233

[x] Rename IndependentSample → Distribution and remove Sample #256

[x] Replace Rand with Uniform distribution #256

[x] Split Rng to Rng: RngCore #265

[x] Move Rng etc. to new rand-core crate [requires SampleRng; ~#264~ #288]

[ ] ~Add HalfOpen01 distribution~

Related issues which need solving:

[x] Naming of try_fill_bytes: https://github.com/dhardy/rand/issues/27

[x] Name of "default" distribution: Uniform; https://github.com/dhardy/rand/issues/81

[x] Distribution names: https://github.com/dhardy/rand/issues/81

[x] Fate of Rand: https://github.com/dhardy/rand/issues/83

[x] Re-exporting: https://github.com/dhardy/rand/issues/15

[x] Decide which Distribution::sample prototype to use #287

[x] Remove default implementations for RngCore methods?

There are some extra changes which should be relatively easy to add after the above:

[ ] Add new floating-point sampling code #274 (partial)

[x] Add new Range code #274

[x] Add CryptoRng: RngCore extension trait #273

[x] Switch thread_rng() to HC-128? https://github.com/dhardy/rand/issues/53

[x] Optional logging support: https://github.com/dhardy/rand/issues/39, https://github.com/dhardy/rand/pull/43

[x] ~New iterator code~ #275 rejected; #286 merged

Desired additions:

[x] Merge #96 (requires testing/review)

[x] Add Bernoulli distribution and gen_bool then deprecate gen_weighted_bool (see #293)

Finally, a list of other planned changes, which probably won't make 0.5:

[x] Replace gen_ascii_chars with Alphanumeric distribution: #279

[ ] Revise character distributions (more classes, possible renaming): likely not desired

[ ] Add SeedableRng::from_hashable: https://github.com/dhardy/rand/issues/18, https://github.com/dhardy/rand/pull/62

[x] Change the RNG used by StdRng: https://github.com/dhardy/rand/issues/53

[ ] Change the RNG used by weak_rng: https://github.com/dhardy/rand/issues/60, https://github.com/dhardy/rand/issues/52

[ ] Publish one or more PRNG algorithm crates and remove XorShiftRng, IsaacRng, ChaChaRng, etc. from rand (don't export): https://github.com/dhardy/rand/issues/58

[ ] Revise WeightedChoice & sequences code: https://github.com/dhardy/rand/issues/82

[ ] Fork protection: https://github.com/dhardy/rand/issues/59

[x] Revise other Rng methods: gen_iter, gen_weighted_bool, gen_ascii_chars, choose, shuffle: #293, #286, #279

P-high X-tracker
opened by dhardy 50
OsRng run-time failure / the rand_os crate / modularity
Roughly, rand could be modularised as follows:

| component | dependencies | | -------------- | ------------------ | | rand_core | | | PRNG crates | core | | rand_os | core + std | | JitterRng | core + std | | EntropyRng | core + os + jitter + std | | ThreadRng | core + os + jitter + entropy + std + hc | | seq | core | | distributions | core | | Rng | core + seq + distributions |

Roughly 30% of the current code comes under non-linear distributions, which is its own story (#290). For the rest I don't think there's sufficient utility for more crates.

I regret accepting #643 as-is; it "promises" to support all std platforms yet fails on WASM with a compile error (#674, #678) and on any unsupported platforms. The reasons that is an issue are convoluted (probably involving users depending on rand with default features yet not needing them all), but the conclusion of #678 is that Rand should always build for wasm32-unknown-unknown, failing if necessary at run-time.

We used to use run-time failure within EntropyRng but disable OsRng at compile time where unneeded. Practical solutions are:

Restore the old behavior: rand_os must compile but without the OsRng struct on unsupported platforms. EntropyRng is currently available on all platforms but with run-time failure when no source is available, however it requires a complex cfg pattern in order to disable the missing OsRng dependency when not available; it would therefore make sense to have this in the same crate.

Enable OsRng on WASM, and use run-time failure on missing implementation.

Enable OsRng on all platforms, and use run-time failure on missing implementation.

I think the cleanest solution would be (1), but @tarcieri @naftulikay I understand there would be resistance to moving JitterRng into rand_os. Is this really justified? I guess in part this depends on whether the RFC proposed in #648 gets accepted, though this will be a while coming.

Otherwise we can do (2) or (3); the difference is that (2) would cause build failures on any platform where Rand is used with std/default features but OsRng is unavailable. I'm not sure whether this is a good idea?

I suppose (2) is most sensible now.

CC @newpavlov @burdges
opened by dhardy 48
Move PRNG algorithms out of Rand
Edit by dhardy: adding a tracker for the planned tasks:

[x] Replace SmallRng algorithm (see https://github.com/dhardy/rand/issues/60 and https://github.com/dhardy/rand/issues/52)

[x] Deprecate XorshiftRng (required: that the RNG is available in another crate)

[x] Move ISAAC* to new crate

[x] Publish PCG RNGs somewhere (rand_rngs, rand_pcg or something)

[ ] Publish SFC generators somewhere

[x] Review xoshiro crate by @vks

[ ] Review state of other published RNGs (see linked comment) and fix up if useful

[ ] Document available RNGs (internal and external crates)

Also: we should consider whether we want to rename PRNGs when moving or adding them.

As decided in https://github.com/dhardy/rand/issues/58.

The idea is to only keep the StdRng and SmallRng wrappers in Rand, and to not expose PRNG algorithms directly.

The field of PRNG algorithms is not static, it seems like a good idea to cut Rand partly loose from the developements to improve our backward-compatability and usability story. Advantages (from the original issue)

rand provide easy to reason generator(s) to satisfy the bulk of the users

rand provides low-friction upgrades and keeps the optimal-ish algorithms readily available

rand will not accumulate and/or deprecate specific algorithms throughout time

Specific rand Rng names won't bleed out to the world, like in forum posts, stackoverflow, etc..

Instead we want to provide an official companion crate, rand_rngs, with a couple of blessed implementations.

Whether Rand should depend on rand_rngs or copy the two algorithms used by StdRng and SmallRng is not fully decided. It might reduce code size if a crate depends on both StdRng and its algorithm (currently Hc128Rng) directly, but that seems rare and not worth much. It has the disadvantage that both implementations should remain in sync. Maybe the CI can check this?

If Rand depends on rand_rngs, it would restrict rand_rngs to only depend on rand_core. But that is what rand_core is for. I am for having Rand depend on rand_rngs.

rand_rngs

rand_rngs should provide a good selection of PRNGs.

Including multiple PRNGs in one crate has the advantage of having a single place to provide guidance, as it helps comparing different algorithms with each other. Also it gives one place to offer consistent benchmarks, to run PRNG test suites, to keep up a common quality level, and possibly to develop functionality that is useful for more than one PRNG (e.g. jumping).

Which PRNGs to include has been the topic of endless discussions. A few are basically decided. I fully expect the number of PRNGs to grow over time. But we also should be careful not to expose too many, as there are hundreds. Every PRNG should have one thing that gives it a clear advantage over others, otherwise it is not worth the inclusion.

Normal PRNGs

https://github.com/dhardy/rand/issues/60 explored normal PRNGs. The following 5 I feel comfortable about for an initial version of rand_rngs:

PCG-XSH-RR 64/32 (LCG)

PCG-XSL-RR 128/64 (MCG)

SFC (32-bit variant)

SFC (64-bit variant)

Xoroshiro128+

The two PCG variants offer good quality and reasonable performance. SFC provides high performance and a chaotic PRNG (not a fixed period). Xoroshiro128+ has acceptable quality but high performance.

An PRNG put together by me, XoroshiroMT, is also good quality and sits between PCG and Xoroshiro128+ qua performance. But now that there is just a new Xoshiro PRNG, it may be worth evaluating that one first, as it is said to also be good quality.

CSPRNGs

For CSPRNGs we currently have two good implementations in Rand

ChaCha20

HC-128

ChaCha20 offers reasonably good performance and uses little memory, while HC-128 brings high performance at the cost of using much memory.

I would also like to see some implementation of AES-CTR DRBG eventually, as it is commonly used, and has hardware support on modern desktop processors.

Other / deprecated PRNGs

We currently have the ISAAC, ISAAC-64 and Xorshift128/32 PRNGs in rand. They have no real advantages over the PRNGs metioned before. It is better to use HC-128 instead of ISAAC, and Xoroshiro128+ or PCG instead of Xorshift. We can include them in something like a deprecated module, but I propose to move them to stand-alone crates outside the Rand repository.

Steps to take

move the prngs module to a seperate rand_rngs crate in the Rand repository, similar to rand_core.

move generators benchmarks over.

lift PRNG implementations from https://github.com/pitdicker/small-rngs (I have updated it to master a while ago, but still have to push the changes)

lift Xoroshiro128+ from https://github.com/vks/xoroshiro/

Does this sound about right?
B-API P-high T-RNG
opened by pitdicker 47
Remove `weak_rng` and `random`?
Decision:

[ ] Remove random() function

[ ] Replace weak_rng() function with SmallRng wrapper

[ ] Change algorithm used by SmallRng (but off topic here, thus not a blocker for this issue)

Both the weak_rng and the random function are little helpers, but don't really do much. To be clear, in the next release users may simply:

replace random() with thread_rng().gen()

replace weak_rng() with XorShiftRng::new() (or a different PRNG)

The first is pure convenience (but also a minor anti-pattern since caching the result of thread_rng can improve performance).

The second has a small advantage: users may specify simply that they want a fast, weak PRNG, and don't mind if the implementation changes. XorShiftRng is reproducible, meaning that library updates won't change its output; weak_rng is not reproducible, meaning the library may replace the internals with better generators in the future.

Note that currently weak_rng() returns XorShiftRng which makes changing the former a breaking change; we should introduce a wrapper around the generator used (i.e. fn weak_rng() -> WeakRng) or drop the function and only use a wrapper type (WeakRng::new()).

Note also that thread_rng() is staying because it is widely used and has additional features.

We already wish to change the weak_rng algorithm and have two candidates (see https://github.com/dhardy/rand/issues/52 and https://github.com/dhardy/rand/issues/60), so it's possible we could have two variants, e.g. FastRng and SmallRng. This issue is not about which algorithm we choose.

remove the random() convenience function?

keep weak_rng() or rename or only keep wrapper types, or remove completely?

E-question B-API
opened by dhardy 47
Implement Bernoulli distribution
This just uses gen_bool, not a different implementation.

It is implemented for all primitive types. Not sure whether that makes sense. Maybe it is better to only generate bool and let the user do the conversion? Having Distribution<T> implemented for more than one T can make type annotations annoying. If we ever add impl Distribution<f32> for Normal, it will probably break a lot of code...

P-high B-value
opened by vks 45
Future of distributions in rand
When working with distributions it is very useful to have access to their density, their distribution function and their quantile function in addition to being able to sample from them. Furthermore, it is nice to be able to calculate their (theoretical, exact) moments. I think it makes sense to have all this functionality in a common interface.

This is implemented in the statrs crate. There is some overlap with rand, sampling is implemented there as well, but for a lot more distributions.

I see the following options:

Only implement sampling in rand. Port the missing distributions from statrs to rand.

Also implement the theoretical properties of the distributions mentioned above in rand, essentially duplicating statrs.

Remove the distributions from rand and suggest to use statrs instead.

What do you think?
E-question B-API T-distributions P-low
opened by vks 43
Add EntropySource wrapper
This is just an idea I am trying. The PR is based on top of https://github.com/rust-lang-nursery/rand/pull/233, only the last 3 commits are new.

I have added an EntropySource wrapper that uses OsRng, and if it errors falls back to JitterRng. I hope this will help in three situations:

The fallback mechanism in NewSeeded can also be used by user code.

It is no longer necessary to have a Reseeder trait, providing an RNG for reseeding is enough.

We can hopefully add a feature flag to let EntropySource use an external RNG. This will make it and NewSeeded available for no_std uses.
opened by pitdicker 42
Rework CryptoRng

This PR introduces the CryptoBlockRng marker trait. It's used instead of CryptoRng on block RNGs, which allows us to mark CryptoRng as a supertrait of RngCore and makes the CryptoRngCore trait redundant.

Additionally, try_fill_bytes is moved to CryptoRng and renamed to crypto_fill_bytes. The rationale here is that error checks for potential RNG failures are practically exclusive to cryptographic code.

Unfortunately, this PR also contains a bunch of formatting changes introduced by cargo fmt. I think it could be worth to include formatting check into our CI to prevent such changes in future.

cc @tarcieri
B-API

opened by newpavlov 5
Performance improvements for `shuffle` and `partial_shuffle`

This is Related to https://github.com/rust-random/rand/issues/1266 but completely orthogonal to https://github.com/rust-random/rand/pull/1268 which improves the performance of a different set of methods in a different way.

This improves the performance of SliceRandom::shuffle() and SliceRandom::partial_shuffle() by essentially batching the random number generation.

It seems to be about 50-100% faster for most slice lengths, with less performance improvement for longer slices. It will use the old method for slices with length longer than 2^32.

This is a value breaking change.

Benchmark results

Partial Shuffle

Partial shuffle half of the slice | Number of Elements | Rng | Old ns/iter | New ns/iter | Ratio (new:old) | | ------------------ | --------- | ----------- | ----------- | --------------- | | 10 | CryptoRng | 44 | 27 | 1.62962963 | | 10 | SmallRng | 30 | 21 | 1.428571429 | | 100 | CryptoRng | 409 | 164 | 2.493902439 | | 100 | SmallRng | 270 | 130 | 2.076923077 | | 1000 | CryptoRng | 3,616 | 1,993 | 1.814350226 | | 1000 | SmallRng | 2,474 | 1,364 | 1.813782991 | | 10000 | CryptoRng | 38,607 | 26,286 | 1.468728601 | | 10000 | SmallRng | 27,824 | 16,927 | 1.6437644 |

Shuffle

| Number of Elements | Rng | Old ns/iter | New ns/iter | Ratio (new:old) | | ------------------ | --------- | ----------- | ----------- | --------------- | | 1 | CryptoRng | 0 | 0 | N/A | | 1 | SmallRng | 0 | 0 | N/A | | 2 | CryptoRng | 11 | 8 | 1.375 | | 2 | SmallRng | 8 | 5 | 1.6 | | 3 | CryptoRng | 18 | 9 | 2 | | 3 | SmallRng | 13 | 6 | 2.166666667 | | 10 | CryptoRng | 88 | 25 | 3.52 | | 10 | SmallRng | 61 | 16 | 3.8125 | | 100 | CryptoRng | 872 | 325 | 2.683076923 | | 100 | SmallRng | 552 | 245 | 2.253061224 | | 1000 | CryptoRng | 7,219 | 3,910 | 1.84629156 | | 1000 | SmallRng | 5,057 | 2,650 | 1.908301887 | | 10000 | CryptoRng | 76,061 | 50,041 | 1.519973622 | | 10000 | SmallRng | 55,682 | 32,715 | 1.702032707 |

opened by wainwrightmark 1
Corner case when the range is too large for floats

I just noticed a paper recently, which reports that the implementation of drawing uniform floating point numbers in rand has some imperfections. Specifically, according to the paper rng.gen_range(-f64::MAX..=f64::MAX) will only generate negative floating point numbers.

After a quick search over the issues I didn't find any issued raised with regard to this paper. I just want to post it in case it's not mentioned here.

opened by cmpute 1

Added new versions of choose and choose_stable

Related to #1266

Edit: I have made some performance improvements (now 2-3x as fast as the original and updated the benchmarks)

DO NOT MERGE as is. The old functions need to be removed and the new ones renamed. I added a lot of benchmarks to test the new changes, they don't need to be permanent. I think Coin_Flipper could do with a better name too, and better tests.

This adds new, better performing, versions of choose() and choose_stable() which perform about 1.5x to 2x as fast.
This actually uses a different method to the one described in the issue, it now only needs to generate one u32 per 16 iterator items on average. This works much better than the other version on longer lists.

This is a value breaking change.

Benchmark results below.
Summary: basically unchanged for size_hinted and window_hinted. About 1.5-2x improvement for unhinted and stable, thought these are more modest when using faster RNGs and when there are fewer elements.

use rand_chacha::ChaCha20Rng as CryptoRng;
use rand_pcg::Pcg32 as SmallRng;

# choose unhinted - this is where I was expecting performance gains
test seq_iter_unhinted_choose_from_10000_cryptoRng_new_version        ... bench:      21,323 ns/iter (+/- 2,551)
test seq_iter_unhinted_choose_from_10000_cryptoRng_old_version        ... bench:      76,843 ns/iter (+/- 8,739)
test seq_iter_unhinted_choose_from_10000_smallRng_new_version         ... bench:      19,321 ns/iter (+/- 870)
test seq_iter_unhinted_choose_from_10000_smallRng_old_version         ... bench:      44,929 ns/iter (+/- 2,894)
test seq_iter_unhinted_choose_from_1000__cryptoRng_new_version         ... bench:       2,233 ns/iter (+/- 198)
test seq_iter_unhinted_choose_from_1000__cryptoRng_old_version         ... bench:       7,019 ns/iter (+/- 434)
test seq_iter_unhinted_choose_from_1000__smallRng_new_version          ... bench:       2,028 ns/iter (+/- 174)
test seq_iter_unhinted_choose_from_1000__smallRng_old_version          ... bench:       4,028 ns/iter (+/- 512)
test seq_iter_unhinted_choose_from_100___cryptoRng_new_version          ... bench:         281 ns/iter (+/- 18)
test seq_iter_unhinted_choose_from_100___cryptoRng_old_version          ... bench:         792 ns/iter (+/- 68)
test seq_iter_unhinted_choose_from_100___smallRng_new_version           ... bench:         269 ns/iter (+/- 16)
test seq_iter_unhinted_choose_from_100___smallRng_old_version           ... bench:         481 ns/iter (+/- 42)
test seq_iter_unhinted_choose_from_10____cryptoRng_new_version           ... bench:          45 ns/iter (+/- 7)
test seq_iter_unhinted_choose_from_10____cryptoRng_old_version           ... bench:         100 ns/iter (+/- 13)
test seq_iter_unhinted_choose_from_10____smallRng_new_version            ... bench:          45 ns/iter (+/- 6)
test seq_iter_unhinted_choose_from_10____smallRng_old_version            ... bench:          64 ns/iter (+/- 5)

# choose_stable() should be very close to unhinted
test seq_iter_stable_choose_from_10000_cryptoRng_new_version          ... bench:      20,740 ns/iter (+/- 1,149)
test seq_iter_stable_choose_from_10000_cryptoRng_old_version          ... bench:      70,251 ns/iter (+/- 7,609)
test seq_iter_stable_choose_from_10000_smallRng_new_version           ... bench:      19,237 ns/iter (+/- 1,996)
test seq_iter_stable_choose_from_10000_smallRng_old_version           ... bench:      45,256 ns/iter (+/- 5,011)
test seq_iter_stable_choose_from_1000__cryptoRng_new_version           ... bench:       2,175 ns/iter (+/- 132)
test seq_iter_stable_choose_from_1000__cryptoRng_old_version           ... bench:       6,455 ns/iter (+/- 1,272)
test seq_iter_stable_choose_from_1000__smallRng_new_version            ... bench:       2,019 ns/iter (+/- 195)
test seq_iter_stable_choose_from_1000__smallRng_old_version            ... bench:       4,132 ns/iter (+/- 391)
test seq_iter_stable_choose_from_100___cryptoRng_new_version            ... bench:         289 ns/iter (+/- 27)
test seq_iter_stable_choose_from_100___cryptoRng_old_version            ... bench:         735 ns/iter (+/- 54)
test seq_iter_stable_choose_from_100___smallRng_new_version             ... bench:         268 ns/iter (+/- 34)
test seq_iter_stable_choose_from_100___smallRng_old_version             ... bench:         478 ns/iter (+/- 41)
test seq_iter_stable_choose_from_10____cryptoRng_new_version             ... bench:          47 ns/iter (+/- 5)
test seq_iter_stable_choose_from_10____cryptoRng_old_version             ... bench:          92 ns/iter (+/- 6)
test seq_iter_stable_choose_from_10____smallRng_new_version              ... bench:          43 ns/iter (+/- 3)
test seq_iter_stable_choose_from_10____smallRng_old_version              ... bench:          64 ns/iter (+/- 4)

# Window hinted - maybe a slight improvement
test seq_iter_window_hinted_choose_from_10000_cryptoRng_new_version   ... bench:      16,382 ns/iter (+/- 1,672)
test seq_iter_window_hinted_choose_from_10000_cryptoRng_old_version   ... bench:      17,044 ns/iter (+/- 2,429)
test seq_iter_window_hinted_choose_from_10000_smallRng_new_version    ... bench:      10,606 ns/iter (+/- 347)
test seq_iter_window_hinted_choose_from_10000_smallRng_old_version    ... bench:      11,415 ns/iter (+/- 974)
test seq_iter_window_hinted_choose_from_1000__cryptoRng_new_version    ... bench:       1,558 ns/iter (+/- 163)
test seq_iter_window_hinted_choose_from_1000__cryptoRng_old_version    ... bench:       1,674 ns/iter (+/- 190)
test seq_iter_window_hinted_choose_from_1000__smallRng_new_version     ... bench:       1,037 ns/iter (+/- 108)
test seq_iter_window_hinted_choose_from_1000__smallRng_old_version     ... bench:       1,106 ns/iter (+/- 93)
test seq_iter_window_hinted_choose_from_100___cryptoRng_new_version     ... bench:         200 ns/iter (+/- 17)
test seq_iter_window_hinted_choose_from_100___cryptoRng_old_version     ... bench:         214 ns/iter (+/- 13)
test seq_iter_window_hinted_choose_from_100___smallRng_new_version      ... bench:         145 ns/iter (+/- 23)
test seq_iter_window_hinted_choose_from_100___smallRng_old_version      ... bench:         149 ns/iter (+/- 12)
test seq_iter_window_hinted_choose_from_10____cryptoRng_new_version      ... bench:          32 ns/iter (+/- 7)
test seq_iter_window_hinted_choose_from_10____cryptoRng_old_version      ... bench:          34 ns/iter (+/- 3)
test seq_iter_window_hinted_choose_from_10____smallRng_new_version       ... bench:          24 ns/iter (+/- 2)
test seq_iter_window_hinted_choose_from_10____smallRng_old_version       ... bench:          26 ns/iter (+/- 3)


# Size hinted - basically no noticeable change - note that these run 1000 times each 
test seq_iter_size_hinted_choose_from_10000_cryptoRng_new_version     ... bench:       8,389 ns/iter (+/- 422) = 953 MB/s
test seq_iter_size_hinted_choose_from_10000_cryptoRng_old_version     ... bench:       8,402 ns/iter (+/- 661) = 952 MB/s
test seq_iter_size_hinted_choose_from_10000_smallRng_new_version      ... bench:       5,325 ns/iter (+/- 521) = 1502 MB/s
test seq_iter_size_hinted_choose_from_10000_smallRng_old_version      ... bench:       5,401 ns/iter (+/- 211) = 1481 MB/s
test seq_iter_size_hinted_choose_from_1000__cryptoRng_new_version      ... bench:       2,654 ns/iter (+/- 219) = 3014 MB/s
test seq_iter_size_hinted_choose_from_1000__cryptoRng_old_version      ... bench:       2,714 ns/iter (+/- 354) = 2947 MB/s
test seq_iter_size_hinted_choose_from_1000__smallRng_new_version       ... bench:       1,211 ns/iter (+/- 45) = 6606 MB/s
test seq_iter_size_hinted_choose_from_1000__smallRng_old_version       ... bench:       1,188 ns/iter (+/- 43) = 6734 MB/s
test seq_iter_size_hinted_choose_from_100___cryptoRng_new_version       ... bench:       5,330 ns/iter (+/- 636) = 1500 MB/s
test seq_iter_size_hinted_choose_from_100___cryptoRng_old_version       ... bench:       5,268 ns/iter (+/- 561) = 1518 MB/s
test seq_iter_size_hinted_choose_from_100___smallRng_new_version        ... bench:       2,948 ns/iter (+/- 406) = 2713 MB/s
test seq_iter_size_hinted_choose_from_100___smallRng_old_version        ... bench:       2,996 ns/iter (+/- 344) = 2670 MB/s
test seq_iter_size_hinted_choose_from_10____cryptoRng_new_version        ... bench:       8,266 ns/iter (+/- 925) = 967 MB/s
test seq_iter_size_hinted_choose_from_10____cryptoRng_old_version        ... bench:       8,873 ns/iter (+/- 1,017) = 901 MB/s
test seq_iter_size_hinted_choose_from_10____smallRng_new_version         ... bench:       5,246 ns/iter (+/- 482) = 1524 MB/s
test seq_iter_size_hinted_choose_from_10____smallRng_old_version         ... bench:       5,036 ns/iter (+/- 385) = 1588 MB/s

opened by wainwrightmark 8

CHANGE: Performance Optimizations for `IteratorRandom::choose()` and related methods
Summary

IteratorRandom::choose() and related methods call gen_range() much more frequently than strictly necessary when choosing from iterators without size hints. By reducing the number of these calls it is possible to speed up the relevant benchmarks by about 1.5-2x

Specifically

# Using Pcg32 test seq_iter_unhinted_choose_from_1000 ... bench: 4,049 ns/iter (+/- 377) #old test seq_iter_unhinted_choose_from_1000 ... bench: 2,730 ns/iter (+/- 335) #new # Using ChaChaRng test seq_iter_unhinted_choose_from_1000 ... bench: 6,831 ns/iter (+/- 317) #old test seq_iter_unhinted_choose_from_1000 ... bench: 3,447 ns/iter (+/- 627) #new

This would not require a breaking change to the API or violate any explicit promises but if this change is made then different items will be randomly returned from the choose() method and the Rng will be in a different state afterwards which may affect some users.

Details

The IteratorRandom::choose() and related methods would have to change.
The way they work now (for iterators without size hints) is that for every item, they generate a random number in the range 0..n where n is the number of items seen so far. If the generated number is zero (i.e. a 1/n chance) the current result changes to the new number.

This means that for the first twenty items, twenty random numbers are generated. My suggestion is to instead generate a number in the range 0..20! (20 factorial is the largest factorial less than u64::MAX) and use this random number instead of the first twenty random numbers.

A number in that range has a 1 in 2 chance of being 0 mod 2, then after division by 2, it has a 1 in 3 chance of being 0 mod 3 and so on up to 20.

After the first twenty numbers you then have to use 33!/20! which only gives you 13 numbers and the efficiency gradually decreases with larger n. This seems like a lot of work but it's still much cheaper than generating new random numbers, especially with cryptographically secure rngs.

There is no measurable performance change when using choose() with a fixed slice iterator (the code for that part is unchanged).
When using an iterator with a size hint but not a fixed size, performance seems to have degraded very slightly. It would be possible to apply this optimization to that case as well though.

I believe the results of the functions would be consistent between 32 and 64 bit architectures (I am generating u64, not usize). I don't have 32 bit hardware to test on so I can't comment on the performance differences.

The following methods would also probably benefit from similar optimizations:

IteratorRandom::choose_stable() IteratorRandom::choose_multiple_fill() IteratorRandom::choose_multiple() SliceRandom::shuffle() SliceRandom::partial_shuffle()

Motivation

Performance.

Alternatives

The code could be left as it is and performance hungry users could use my crate which has a fast version of the choose function (I made the crate just to have a random_max function but got carried away optimizing it).

If this is considered worth doing, I'm happy to submit a PR. I have written most of the code already for my own crate.

Have a lovely day, Mark
opened by wainwrightmark 4
Drawing from multivariate normal distribution

Background

I have looked at the documentation, and it looks like this is currently not possible (?)

What is your motivation? I keep going back to Python and Numpy's multivariate_normal whenever I need this, and I was wondering if it was within the scope of the rand_distr crate.

Feature request

I would like to be able to draw samples from a multivariate normal distribution given a covariance matrix.

opened by DJDuque 2