Super-fast float parser in Rust

Ivan Smirnov

Last update: Dec 10, 2022

Overview

fast-float

This crate provides a super-fast decimal number parser from strings into floats.

[dependencies]
fast-float = "0.2"

There are no dependencies and the crate can be used in a no_std context by disabling the "std" feature.

Compiler support: rustc 1.37+.

Usage

There are two top-level functions provided: parse() and parse_partial(), both taking either a string or a byte slice and parsing the input into either f32 or f64:

parse() treats the whole string as a decimal number and returns an error if there are invalid characters or if the string is empty.
parse_partial() tries to find the longest substring at the beginning of the given input string that can be parsed as a decimal number and, in the case of success, returns the parsed value along with the number of characters processed; an error is returned if the string doesn't start with a decimal number or if it is empty. This function is most useful as a building block when constructing more complex parsers, or when parsing streams of data.

Example:

// Parse the entire string as a decimal number.
let s = "1.23e-02";
let x: f32 = fast_float::parse(s).unwrap();
assert_eq!(x, 0.0123);

// Parse as many characters as possible as a decimal number.
let s = "1.23e-02foo";
let (x, n) = fast_float::parse_partial::<f32, _>(s).unwrap();
assert_eq!(x, 0.0123);
assert_eq!(n, 8);
assert_eq!(&s[n..], "foo");
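
As a rough illustration of the "building block" use case, the sketch below splits a comma-separated list by repeatedly parsing a numeric prefix and then skipping one separator. To keep it runnable without the crate, parse_prefix is a hypothetical std-only stand-in (it is not the crate's implementation and only handles plain decimal and exponent syntax); with the crate available, a call such as fast_float::parse_partial::<f64, _>(rest) would presumably take its place.

```rust
/// Hypothetical std-only stand-in for a "longest numeric prefix" parser.
/// Returns the parsed value and the number of bytes consumed.
fn parse_prefix(s: &str) -> Option<(f64, usize)> {
    let b = s.as_bytes();
    let mut i = 0;
    // Optional sign.
    if i < b.len() && (b[i] == b'+' || b[i] == b'-') {
        i += 1;
    }
    // Integer digits.
    let int_start = i;
    while i < b.len() && b[i].is_ascii_digit() {
        i += 1;
    }
    let mut digits = i - int_start;
    // Optional fraction; the dot is consumed only if some digit exists.
    if i < b.len() && b[i] == b'.' {
        let frac_start = i + 1;
        let mut j = frac_start;
        while j < b.len() && b[j].is_ascii_digit() {
            j += 1;
        }
        digits += j - frac_start;
        if digits > 0 {
            i = j;
        }
    }
    if digits == 0 {
        return None;
    }
    // Optional exponent; consumed only if at least one exponent digit follows.
    if i < b.len() && (b[i] == b'e' || b[i] == b'E') {
        let mut j = i + 1;
        if j < b.len() && (b[j] == b'+' || b[j] == b'-') {
            j += 1;
        }
        let exp_start = j;
        while j < b.len() && b[j].is_ascii_digit() {
            j += 1;
        }
        if j > exp_start {
            i = j;
        }
    }
    s[..i].parse::<f64>().ok().map(|x| (x, i))
}

/// Parses "a,b,c" into numbers; returns None on malformed input.
fn parse_list(mut rest: &str) -> Option<Vec<f64>> {
    let mut out = Vec::new();
    loop {
        let (x, n) = parse_prefix(rest)?;
        out.push(x);
        rest = &rest[n..];
        match rest.strip_prefix(',') {
            Some(tail) => rest = tail,
            None => return if rest.is_empty() { Some(out) } else { None },
        }
    }
}
```

The caller never has to locate the end of a number itself: the returned byte count says where the next token begins.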

Details

This crate is a direct port of Daniel Lemire's fast_float C++ library (valuable discussions with Daniel while porting it helped shape the crate and get it to the performance level it's at now), with some Rust-specific tweaks. Please see the original repository for many useful details regarding the algorithm and the implementation.

The parser is locale-independent. The resulting value is the closest floating-point value (as either f32 or f64), using the "round to even" convention for inputs that fall exactly halfway between two representable values. That is, we provide exact parsing according to the IEEE standard.
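
To make the "round to even" convention concrete, here is a small sketch that uses the standard library's parser as a stand-in (an assumption made so the snippet needs no dependencies; recent Rust releases round correctly as well). Above 2^24, consecutive f32 values are 2 apart, so odd integers are exact ties and resolve to the neighbour whose significand is even.

```rust
/// Parses a decimal string into the nearest f32 using the standard library.
/// (Stand-in for the crate's parse(); the helper name is illustrative.)
fn nearest_f32(s: &str) -> f32 {
    s.parse::<f32>().expect("valid decimal float")
}

/// Returns the two tie cases and one non-tie case as (input, parsed) pairs.
fn tie_examples() -> Vec<(&'static str, f32)> {
    vec![
        // Halfway between 16777216 and 16777218: the even significand wins.
        ("16777217", nearest_f32("16777217")),
        // Halfway between 16777218 and 16777220: again the even one wins.
        ("16777219", nearest_f32("16777219")),
        // Slightly above the halfway point: no tie, rounds up.
        ("16777217.0000001", nearest_f32("16777217.0000001")),
    ]
}
```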

Infinity and NaN values can be parsed, along with scientific notation.

Both little-endian and big-endian platforms are equally supported, with extra optimizations enabled on little-endian architectures.

Testing

There are a few ways this crate is tested:

A suite of explicit tests (taken from the original library) covering lots of edge cases.
A file-based test suite (taken from the original library; credits to Nigel Tao), ~5M tests.
All 4B float32 numbers are exhaustively roundtripped via ryu formatter.
Roundtripping a large quantity of random float64 numbers via ryu formatter.
Roundtripping float64 numbers and fuzzing random input strings via cargo-fuzz.
All explicit test suites run on CI; roundtripping and fuzzing are run manually.
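
The roundtrip idea can be sketched with the standard library alone. This is not the crate's test harness: it assumes std formatting in place of ryu and std parsing in place of the crate, and it visits only every step-th bit pattern rather than all 4B of them.

```rust
/// Counts f32 bit patterns (sampled every `step`) that fail to survive
/// a format-then-parse roundtrip. NaNs are skipped because their payload
/// is not expected to be preserved by text formatting.
fn roundtrip_mismatches(step: usize) -> usize {
    (0..=u32::MAX)
        .step_by(step)
        .map(f32::from_bits)
        .filter(|x| !x.is_nan())
        .filter(|x| {
            // `{:e}` without a precision prints the shortest digits that roundtrip.
            let text = format!("{:e}", x);
            let back: f32 = text.parse().expect("formatted float should parse");
            back.to_bits() != x.to_bits()
        })
        .count()
}
```

Comparing bit patterns rather than values keeps -0.0 and 0.0 distinct, which a plain == comparison would not.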

Performance

The presented parser seems to beat all of the existing C/C++/Rust float parsers known to us at the moment by a large margin, in all of the datasets we tested it on so far – see detailed benchmarks below (the only exception being the original fast_float C++ library, of course – performance of which is within noise bounds of this crate). On modern machines like Apple M1, parsing throughput can reach up to 1.5 GB/s.

In particular, it is faster than Rust standard library's FromStr::from_str() by a factor of 2-8x (larger factor for longer float strings), and is typically 2-3x faster than the nearest competitors.

While various details regarding the algorithm can be found in the repository for the original C++ library, here are a few brief notes:

The parser is specialized to work lightning-fast on inputs with at most 19 significant digits (which constitutes the so-called "fast path"). We believe that most real-life inputs should fall under this category, and we treat longer inputs as "degenerate" edge cases since they inevitably cause overflows and loss of precision.
If the significand happens to be longer than 19 digits, the parser falls back to the "slow path", in which case its performance roughly matches that of the top Rust/C++ libraries (and still beats them most of the time, although not by a lot).
On little-endian systems, there are additional optimizations for numbers with more than 8 digits after the decimal point.
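
One way to see where the 19-digit limit comes from: any 19-digit decimal integer fits in a u64 (u64::MAX has 20 digits), so the significand can be accumulated with plain integer arithmetic and no overflow checks. The sketch below is illustrative only; read_significand and MAX_DIGITS are hypothetical names, not the crate's internals.

```rust
/// Largest digit count that always fits in a u64 without overflow.
const MAX_DIGITS: usize = 19;

/// Accumulates the significant digits of an all-digit string.
/// Returns (value of the first 19 significant digits, total significant digits),
/// or None if a non-digit byte is found. A count above MAX_DIGITS means the
/// value was truncated and a slower, exact path would be needed.
fn read_significand(digits: &str) -> Option<(u64, usize)> {
    let mut value: u64 = 0;
    let mut count = 0;
    for b in digits.bytes() {
        if !b.is_ascii_digit() {
            return None;
        }
        if count == 0 && b == b'0' {
            continue; // leading zeros are not significant
        }
        count += 1;
        if count <= MAX_DIGITS {
            // Cannot overflow: at most 19 digits are ever accumulated.
            value = value * 10 + u64::from(b - b'0');
        }
    }
    Some((value, count))
}
```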

Benchmarks

Below are tables of best timings in nanoseconds for parsing a single number into a 64-bit float.
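
For readers who prefer throughput figures, the per-number timings convert directly. The helpers below are a small sketch (the function names are illustrative, and MB is assumed to mean 10^6 bytes). The uniform dataset is listed in the benchmark output further down as 50000 numbers totalling 0.87 MB, i.e. roughly 17.4 bytes per number; at 11.24 ns per number that works out to about 1548 MB/s, in line with the "up to 1.5 GB/s" figure quoted earlier.

```rust
/// Millions of floats parsed per second, given nanoseconds per float.
fn mfloats_per_sec(ns_per_float: f64) -> f64 {
    1_000.0 / ns_per_float
}

/// Megabytes per second, given the average input size of one float in bytes
/// and the time to parse it in nanoseconds (bytes/ns is GB/s, hence x1000).
fn mb_per_sec(bytes_per_float: f64, ns_per_float: f64) -> f64 {
    bytes_per_float / ns_per_float * 1_000.0
}
```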

Intel i7-4771

Intel i7-4771 3.5GHz, macOS, Rust 1.49.

	`canada`	`mesh`	`uniform`	`iidi`	`iei`	`rec32`
fast-float	21.58	10.70	19.36	40.50	26.07	29.13
lexical	65.90	23.28	54.75	75.80	52.18	75.36
from_str	174.43	22.30	99.93	227.76	111.31	204.46
fast_float (C++)	22.78	10.99	20.05	41.12	27.51	30.85
abseil (C++)	42.66	32.88	46.01	50.83	46.33	49.95
netlib (C)	57.53	24.86	64.72	56.63	36.20	67.29
strtod (C)	286.10	31.15	258.73	295.73	205.72	315.95

Apple M1

Apple M1, macOS, Rust 1.49.

	`canada`	`mesh`	`uniform`	`iidi`	`iei`	`rec32`
fast-float	14.84	5.98	11.24	33.24	21.30	17.86
lexical	47.09	16.51	43.46	56.06	36.68	55.48
from_str	136.00	13.84	74.64	179.87	77.91	154.53
fast_float (C++)	13.71	7.28	11.71	32.94	20.64	18.30
abseil (C++)	36.55	24.20	38.48	40.86	35.46	40.09
netlib (C)	47.19	14.12	48.85	52.28	33.70	48.79
strtod (C)	176.13	21.48	165.43	187.98	132.19	190.63

AMD Rome

AMD Rome, Linux, Rust 1.49.

	`canada`	`mesh`	`uniform`	`iidi`	`iei`	`rec32`
fast-float	25.90	12.12	20.54	47.01	29.23	32.36
lexical	63.18	22.13	54.78	81.23	55.06	79.14
from_str	190.06	26.10	102.44	239.87	119.04	211.73
fast_float (C++)	21.29	10.47	18.31	42.33	24.56	29.76
abseil (C++)	44.54	34.13	47.38	52.64	43.77	53.03
netlib (C)	69.43	23.31	79.98	72.17	35.81	86.91
strtod (C)	123.37	65.68	101.58	118.36	118.61	123.72

Parsers

fast-float – this very crate
lexical – lexical_core, v0.7 (non-lossy; same performance as lossy)
from_str – Rust standard library, FromStr trait
fast_float (C++) – original C++ implementation of 'fast-float' method
abseil (C++) – Abseil C++ Common Libraries
netlib (C++) – C++ Network Library
strtod (C) – C standard library

Datasets

canada – numbers in canada.txt file
mesh – numbers in mesh.txt file
uniform – uniform random numbers from 0 to 1
iidi – random numbers of format %d%d.%d
iei – random numbers of format %de%d
rec32 – reciprocals of random 32-bit integers

Notes

The two test files referred to above can be found in this repository.
The Rust part of the table (along with a few other benchmarks) can be generated via the benchmark tool that can be found under extras/simple-bench of this repo.
The C/C++ part of the table (along with a few other benchmarks and parsers) can be generated via a C++ utility that can be found in this repository.

References

Daniel Lemire, Number Parsing at a Gigabyte per Second, arXiv:2101.11408

License

Licensed under either of Apache License, Version 2.0 or MIT license at your option.
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in this crate by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

Comments

Consider integration in core and std?

Have you considered replacing the dec2flt algorithm in the standard library with your crate?

I remember seeing several members of the Rust Libs Team encouraging people who have better algorithms (faster, more efficient, ...) to integrate their crates into the std library.

Source of dec2flt

opened by Urgau 76

Results with Apple M1 processor

❯ rustc --version
rustc 1.49.0 (e1884a8e3 2020-12-29)

~
❯ uname -a
Darwin MacBookPro2020 20.2.0 Darwin Kernel Version 20.2.0: Wed Dec  2 20:40:21 PST 2020; root:xnu-7195.60.75~1/RELEASE_ARM64_T8101 arm64


cargo run --release -- file /data/canada.txt
    Finished release [optimized] target(s) in 0.01s
     Running `/Users/lemire/CVS/github/fast-float-rust/target/release/fast-float-simple-bench file /Users/lemire/CVS/github/fast_double_parser/benchmarks/data/canada.txt`
=====================================================================================
|                         canada.txt (111126, 1.93 MB, f64)                         |
|===================================================================================|
|                                                                                   |
| ns/float                min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float            14.96    14.96    14.98    14.99    15.00    15.11    16.54 |
| lexical               47.55    47.57    47.59    47.63    47.82    48.52    51.50 |
| lexical/lossy         47.60    47.62    47.65    47.72    47.88    48.57    52.71 |
| from_str             136.74   137.03   137.60   138.06   138.48   141.75   288.85 |
|                                                                                   |
| Mfloat/s                min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float            60.45    66.20    66.65    66.73    66.74    66.84    66.86 |
| lexical               19.42    20.61    20.92    20.99    21.01    21.02    21.03 |
| lexical/lossy         18.97    20.60    20.89    20.96    20.99    21.00    21.01 |
| from_str               3.46     7.06     7.22     7.24     7.27     7.30     7.31 |
|                                                                                   |
| MB/s                    min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float          1051.83  1151.95  1159.75  1161.21  1161.44  1163.18  1163.50 |
| lexical              337.91   358.70   363.96   365.31   365.63   365.77   365.94 |
| lexical/lossy        330.12   358.50   363.45   364.68   365.17   365.46   365.60 |
| from_str              60.24   122.79   125.67   126.04   126.47   126.99   127.26 |
|                                                                                   |
=====================================================================================

❯ cargo run --release -- random uniform

    Finished release [optimized] target(s) in 0.01s
     Running `/Users/lemire/CVS/github/fast-float-rust/target/release/fast-float-simple-bench random uniform`
=====================================================================================
|                           uniform (50000, 0.87 MB, f64)                           |
|===================================================================================|
|                                                                                   |
| ns/float                min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float            11.52    11.53    11.53    11.54    11.59    12.12    52.19 |
| lexical               42.92    42.93    42.96    42.98    43.01    43.35    47.00 |
| lexical/lossy         42.77    42.78    42.84    42.95    43.52    44.51    47.42 |
| from_str              74.71    74.78    74.87    74.92    75.01    75.46    81.47 |
|                                                                                   |
| Mfloat/s                min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float            19.16    82.50    86.28    86.66    86.70    86.73    86.80 |
| lexical               21.28    23.07    23.25    23.27    23.28    23.30    23.30 |
| lexical/lossy         21.09    22.47    22.98    23.28    23.34    23.37    23.38 |
| from_str              12.27    13.25    13.33    13.35    13.36    13.37    13.38 |
|                                                                                   |
| MB/s                    min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float           333.92  1437.58  1503.42  1510.15  1510.80  1511.35  1512.55 |
| lexical              370.80   402.07   405.16   405.43   405.67   405.94   405.98 |
| lexical/lossy        367.45   391.49   400.47   405.73   406.81   407.31   407.38 |
| from_str             213.89   230.92   232.31   232.58   232.75   233.02   233.23 |
|                                                                                   |
=====================================================================================

opened by lemire 11

Wrong value with very long floats

For some floats with a long trail of zeroes, the value is not correctly rounded after a sufficient amount of zeroes.

Code:

use fast_float::*;

fn main() {
    let base = "1.1754941406275178592461758986628081843312458647327962400313859427181746759860647699724722770042717456817626953125";
    let mut a = String::from(base);
    let mut b = a.clone();
    a.push_str(&"0".repeat(655));
    a.push_str("e-38");
    b.push_str(&"0".repeat(656));
    b.push_str("e-38");
    let a = parse::<f32, &str>(&a).unwrap();
    let b = parse::<f32, &str>(&b).unwrap();
    println!("a: {:x}", a.to_bits());
    println!("b: {:x}", b.to_bits());
}

Output:

a: 7ffffe
b: 7fffff
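
Appending zeros after the last fractional digit does not change the decimal value, so a correct parser should return the same bits for every padding length. A dependency-free way to cross-check this is sketched below, using the standard library's parser as the reference (an assumption: it relies on std being correctly rounded for long inputs, which is believed to hold on recent toolchains). The helper name is illustrative.

```rust
/// The decimal prefix from the report above (before the padding zeros).
const BASE: &str = "1.1754941406275178592461758986628081843312458647327962400313859427181746759860647699724722770042717456817626953125";

/// Parses BASE followed by `zeros` trailing zeros and "e-38" with std,
/// returning the resulting f32 bit pattern.
fn bits_with_trailing_zeros(zeros: usize) -> u32 {
    let text = format!("{}{}e-38", BASE, "0".repeat(zeros));
    text.parse::<f32>().expect("valid decimal float").to_bits()
}
```

Comparing bits_with_trailing_zeros(655) with bits_with_trailing_zeros(656) mirrors the two inputs in the report.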

Works in debug and release modes

rustc --version --verbose

rustc 1.47.0-nightly (6c8927b0c 2020-07-26)
binary: rustc
commit-hash: 6c8927b0cf80ceee19386026cf9d7fd4fd9d486f
commit-date: 2020-07-26
host: x86_64-unknown-linux-gnu
release: 1.47.0-nightly
LLVM version: 10.0

opened by AnttiParaoanu 6

Remove Unnecessary Unsafe and Endian-Dependent Codepaths

Issue

Float::from_bits is implemented as mem::transmute, and therefore does exactly what we need in all cases.

Solution

We don't need an endian-dependent code-path: we can just use the mask 0xFFFFFFFF to ensure it works, which will grab the least-significant 32 bits, and then create a float directly from those bits.

pub unsafe fn to_float_v1(word: u64) -> f32 {
    if cfg!(target_endian = "big") {
        *(&word as *const _ as *const f32).add(1)
    } else {
        *(&word as *const _ as *const f32)
    }
}

pub fn to_float_v2(word: u64) -> f32 {
    f32::from_bits((word & 0xFFFFFFFF) as u32)
}

example::to_float_v1:
        movd    xmm0, edi
        ret

example::to_float_v2:
        movd    xmm0, edi
        ret

The full implementation therefore could be:

diff --git a/src/parse.rs b/src/parse.rs
index 5571da7..9c592d4 100644
--- a/src/parse.rs
+++ b/src/parse.rs
@@ -1,5 +1,3 @@
-use core::mem;
-
 use crate::binary::compute_float;
 use crate::float::Float;
 use crate::number::{parse_inf_nan, parse_number};
@@ -32,13 +30,5 @@ pub fn parse_float<F: Float>(s: &[u8]) -> Option<(F, usize)> {
     if num.negative {
         word |= 1_u64 << F::SIGN_INDEX;
     }
-    let value = unsafe {
-        if cfg!(target_endian = "big") && mem::size_of::<F>() == 4 {
-            *(&word as *const _ as *const F).add(1)
-        } else {
-            *(&word as *const _ as *const F)
-        }
-    };
-
-    Some((value, rest))
+    Some((F::from_u64_bits(word), rest))
 }
diff --git a/src/float.rs b/src/float.rs
index 39bec41..a976408 100644
--- a/src/float.rs
+++ b/src/float.rs
@@ -40,6 +40,7 @@ pub trait Float:
     const MAX_MANTISSA_FAST_PATH: u64 = 2_u64 << Self::MANTISSA_EXPLICIT_BITS;
 
     fn from_u64(v: u64) -> Self;
+    fn from_u64_bits(v: u64) -> Self;
     fn pow10_fast_path(exponent: usize) -> Self;
 }
 
@@ -67,6 +68,11 @@ impl Float for f32 {
         v as _
     }
 
+    #[inline]
+    fn from_u64_bits(v: u64) -> Self {
+        f32::from_bits((v & 0xFFFFFFFF) as u32)
+    }
+
     #[inline]
     fn pow10_fast_path(exponent: usize) -> Self {
         #[allow(clippy::use_self)]
@@ -101,6 +107,11 @@ impl Float for f64 {
         v as _
     }
 
+    #[inline]
+    fn from_u64_bits(v: u64) -> Self {
+        f64::from_bits(v)
+    }
+
     #[inline]
     fn pow10_fast_path(exponent: usize) -> Self {
         #[allow(clippy::use_self)]

This was tested on both little-endian and big-endian platforms using the following cross targets:

powerpc-unknown-linux-gnu
x86_64-unknown-linux-gnu

opened by Alexhuszagh 5

already merged into std?

Hi,

I'm curious whether it's still worth using this crate, or whether std has already merged this code, according to https://github.com/rust-lang/rust/pull/86761?

So just doing f32::from_str(the_str).unwrap() should now parse the float fast with the Lemire algorithm, right?

opened by evbo 4

Added read_u64 optimizations to big endian.

Fixes #26.

Rationale

This should produce the same byte-code, and remove all endian-dependent codepaths, given that the following are true:

u64::from_le and u64::to_le are no-ops on little-endian architectures.
u64::from_le and u64::to_le are very cheap on big-endian architectures.
ptr::read_unaligned and ptr::write_unaligned are identical to ptr::copy_nonoverlapping(src, dst, mem::size_of::<T>())

The first two are trivially true:

to_le and from_le are no-ops on little-endian, and cheap on big-endian.

For 3, we can see that read_unaligned is effectively identical to ptr::copy_nonoverlapping(src, dst, mem::size_of::<T>()), as long as MaybeUninit compiles down to no instructions.

Using the following source, we can see they're identical (on little-endian systems):

use std::ptr;

pub fn write_u64_v1(bytes: &mut [u8], value: u64) {
    let src = &value as *const _ as *const u8;
    let dst = bytes.as_mut_ptr();
    unsafe { ptr::copy_nonoverlapping(src, dst, 8) };
}

pub fn write_u64_v2(bytes: &mut [u8], value: u64) {
    let dst = bytes.as_mut_ptr() as *mut u64;
    unsafe { ptr::write_unaligned(dst, u64::to_le(value)) };
}

pub fn read_u64_v1(bytes: &[u8]) -> u64 {
    let mut value = 0_u64;
    let src = bytes.as_ptr();
    let dst = &mut value as *mut _ as *mut u8;
    unsafe { ptr::copy_nonoverlapping(src, dst, 8) };
    value
}

pub fn read_u64_v2(bytes: &[u8]) -> u64 {
    let src = bytes.as_ptr() as *const u64;
    u64::from_le(unsafe { ptr::read_unaligned(src) })
}

Compiled with -C opt-level=3, we can see the x86_64 assembly is identical.

example::read_u64_v1:
        mov     rax, qword ptr [rdi]
        ret

example::read_u64_v2:
        mov     rax, qword ptr [rdi]
        ret

example::write_u64_v1:
        mov     qword ptr [rdi], rdx
        ret

example::write_u64_v2:
        mov     qword ptr [rdi], rdx
        ret

This also includes tests to ensure that both big-endian and little-endian systems read the bytes the same way.

Correctness Concerns

Should be non-existent, since as long as the value is read and written to the same native integer, then all the integer operations will produce the same result no matter the byte-order of the architecture. Tests using b"01234567" are included for both read_u64 and write_u64, which should confirm it produces the integer 0x3736353433323130. If we did not use to_le and from_le, we'd expect the opposite byte-order, or 0x3031323334353637 (which would correspond to bytes of b"76543210" in little-endian). In short, we've confirmed we've gotten the proper result, and we've provided a significant optimization for big-endian architectures, and simplified a few functions.
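
The two byte orders discussed above can be reproduced with safe standard-library calls. This is a sketch for checking the expected constants, not the code in the PR (which uses ptr::read_unaligned); the helper names are illustrative.

```rust
/// Reads the first 8 bytes as a little-endian u64 (panics on shorter input).
fn read_u64_le(bytes: &[u8]) -> u64 {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(&bytes[..8]);
    u64::from_le_bytes(buf)
}

/// Reads the first 8 bytes as a big-endian u64 (panics on shorter input).
fn read_u64_be(bytes: &[u8]) -> u64 {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(&bytes[..8]);
    u64::from_be_bytes(buf)
}
```

Both functions return the same integers on every architecture, which is the property the to_le/from_le calls are meant to guarantee.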

Alternatives

We could instead change all the masks and operations that check the digits so that they work in big-endian order; however, this might require some additional effort to check correctness, and might require changes in many more locations. Since swapping the byte-order of an integer is effectively free in the grand scheme of things, this should be satisfactory.

Benchmarks

The benchmarks on big-endian are emulated via Qemu, and therefore should be taken with a grain of salt. However, the performance for little-endian systems is identical, and the (emulated) performance improves for big-endian systems.

Little-Endian (Native), `read_u64`

=====================================================================================
|                         canada.txt (111126, 1.93 MB, f64)                         |
|===================================================================================|
|                                                                                   |
| ns/float                min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float            27.13    27.29    27.66    28.09    28.56    29.90    44.77 |
| lexical               75.72    76.36    76.86    77.39    78.48    80.79   100.68 |
| from_str             200.21   200.92   201.65   202.70   204.25   209.90   314.91 |
|                                                                                   |
| Mfloat/s                min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float            22.34    33.51    35.02    35.61    36.15    36.65    36.86 |
| lexical                9.93    12.39    12.75    12.92    13.01    13.10    13.21 |
| from_str               3.18     4.77     4.90     4.93     4.96     4.98     4.99 |
|                                                                                   |
| MB/s                    min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float           388.73   583.04   609.39   619.59   629.11   637.69   641.38 |
| lexical              172.84   215.56   221.78   224.87   226.42   227.89   229.81 |
| from_str              55.26    82.92    85.20    85.85    86.29    86.61    86.92 |
|                                                                                   |
=====================================================================================

Little-Endian (Native), `master`

=====================================================================================
|                         canada.txt (111126, 1.93 MB, f64)                         |
|===================================================================================|
|                                                                                   |
| ns/float                min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float            27.12    27.30    27.66    28.01    28.41    29.42    36.89 |
| lexical               75.76    75.98    76.48    76.98    77.95    81.02    96.75 |
| from_str             200.38   201.01   201.69   202.55   204.46   209.63   230.14 |
|                                                                                   |
| Mfloat/s                min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float            27.10    34.00    35.20    35.70    36.16    36.63    36.88 |
| lexical               10.34    12.35    12.83    12.99    13.08    13.16    13.20 |
| from_str               4.35     4.77     4.89     4.94     4.96     4.98     4.99 |
|                                                                                   |
| MB/s                    min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float           471.66   591.63   612.61   621.26   629.17   637.35   641.71 |
| lexical              179.86   214.90   223.25   226.07   227.53   229.01   229.69 |
| from_str              75.61    83.04    85.11    85.91    86.28    86.57    86.84 |
|                                                                                   |
=====================================================================================

Big-Endian (powerpc-unknown-linux-gnu), `read_u64`

=====================================================================================
|                         canada.txt (111126, 1.93 MB, f64)                         |
|===================================================================================|
|                                                                                   |
| ns/float                min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float           239.00   240.00   240.81   241.40   242.22   245.25   270.75 |
| lexical              600.54   603.88   607.95   614.02   617.52   629.30   859.01 |
| from_str            1318.93  1325.09  1328.26  1331.44  1335.09  1349.77  1497.10 |
|                                                                                   |
| Mfloat/s                min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float             3.69     4.08     4.13     4.14     4.15     4.17     4.18 |
| lexical                1.16     1.59     1.62     1.63     1.64     1.66     1.67 |
| from_str               0.67     0.74     0.75     0.75     0.75     0.75     0.76 |
|                                                                                   |
| MB/s                    min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float            64.27    70.96    71.84    72.09    72.26    72.51    72.81 |
| lexical               20.26    27.67    28.18    28.34    28.62    28.82    28.98 |
| from_str              11.62    12.89    13.03    13.07    13.10    13.13    13.19 |
|                                                                                   |
=====================================================================================

Big-Endian (powerpc-unknown-linux-gnu), `master`

=====================================================================================
|                         canada.txt (111126, 1.93 MB, f64)                         |
|===================================================================================|
|                                                                                   |
| ns/float                min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float           259.11   261.30   262.17   262.88   263.73   267.88   302.34 |
| lexical              613.42   614.97   616.32   617.60   619.06   624.77   672.53 |
| from_str            1319.05  1328.78  1351.89  1357.88  1361.90  1374.38  1481.66 |
|                                                                                   |
| Mfloat/s                min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float             3.31     3.73     3.79     3.80     3.81     3.83     3.86 |
| lexical                1.49     1.60     1.62     1.62     1.62     1.63     1.63 |
| from_str               0.67     0.73     0.73     0.74     0.74     0.75     0.76 |
|                                                                                   |
| MB/s                    min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float            57.56    64.96    65.98    66.20    66.38    66.60    67.16 |
| lexical               25.87    27.86    28.11    28.18    28.23    28.30    28.37 |
| from_str              11.74    12.66    12.78    12.82    12.87    13.10    13.19 |
|                                                                                   |
=====================================================================================

opened by Alexhuszagh 3

remove unsafe from float.rs at very small performance cost

OLD (min/median/max):

ns/float: 22.29/23.64/28.55 Mfloat/s: 44.86/42.31/35.03 MB/s: 781.57/737.14/610.28

NEW (min/median/max):

ns/float: 22.9/24.94/30.69 Mfloat/s: 43.67/40.10/32.59 MB/s: 760.87/698.71/567.77

I'm going to continue working on this to try and improve it as much as possible.

opened by Protowalker 3

Results with AMD Rome (Zen 2) processor

$ rustc --version
rustc 1.49.0 (e1884a8e3 2020-12-29)

$ uname -a
Linux rome 3.10.0-1062.4.3.el7.x86_64 #1 SMP Wed Nov 13 23:58:53 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

$ cargo run --release -- file /data/canada.txt
    Finished release [optimized] target(s) in 0.02s
     Running `/home/dlemire/CVS/github/fast-float-rust/target/release/fast-float-simple-bench file /home/dlemire/CVS/github/tmp/fast_double_parser/benchmarks/data/canada.txt`
=====================================================================================
|                         canada.txt (111126, 1.93 MB, f64)                         |
|===================================================================================|
|                                                                                   |
| ns/float                min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float            25.90    25.93    25.98    26.01    26.04    26.06    26.38 |
| lexical               63.18    63.23    63.29    63.33    63.37    63.44    63.57 |
| lexical/lossy         63.25    63.32    63.37    63.41    63.45    63.52    63.78 |
| from_str             190.06   190.23   190.33   190.39   190.46   190.57   203.78 |
|                                                                                   |
| Mfloat/s                min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float            37.90    38.37    38.41    38.45    38.48    38.57    38.60 |
| lexical               15.73    15.76    15.78    15.79    15.80    15.81    15.83 |
| lexical/lossy         15.68    15.74    15.76    15.77    15.78    15.79    15.81 |
| from_str               4.91     5.25     5.25     5.25     5.25     5.26     5.26 |
|                                                                                   |
| MB/s                    min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float           659.60   667.77   668.32   669.13   669.68   671.12   671.74 |
| lexical              273.75   274.29   274.58   274.77   274.95   275.19   275.42 |
| lexical/lossy        272.83   273.94   274.23   274.41   274.59   274.80   275.10 |
| from_str              85.39    91.32    91.36    91.40    91.43    91.47    91.56 |
|                                                                                   |
=====================================================================================

$ cargo run --release -- random uniform
=====================================================================================
|                           uniform (50000, 0.87 MB, f64)                           |
|===================================================================================|
|                                                                                   |
| ns/float                min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float            20.54    20.57    20.58    20.59    20.61    20.70    21.43 |
| lexical               54.78    54.84    54.86    54.88    54.91    54.99    55.72 |
| lexical/lossy         54.77    54.81    54.84    54.86    54.89    54.98    55.76 |
| from_str             103.56   103.63   103.71   103.80   103.90   104.05   104.86 |
|                                                                                   |
| Mfloat/s                min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float            46.65    48.32    48.52    48.57    48.59    48.62    48.68 |
| lexical               17.95    18.19    18.21    18.22    18.23    18.24    18.26 |
| lexical/lossy         17.93    18.19    18.22    18.23    18.24    18.25    18.26 |
| from_str               9.54     9.61     9.62     9.63     9.64     9.65     9.66 |
|                                                                                   |
| MB/s                    min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float           812.97   842.00   845.54   846.33   846.75   847.31   848.21 |
| lexical              312.73   316.89   317.35   317.52   317.64   317.78   318.12 |
| lexical/lossy        312.53   316.96   317.47   317.63   317.77   317.95   318.15 |
| from_str             166.18   167.47   167.72   167.88   168.02   168.15   168.27 |
|                                                                                   |
=====================================================================================
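The three blocks in each table report the same measurement in different units. A minimal sketch of the conversion follows, assuming that "MB" means 10^6 bytes and that throughput is derived from the per-float time; the function names are hypothetical and are not taken from the bench tool.

```rust
/// Converts a per-float latency in nanoseconds to millions of floats per second.
/// (1 float / t ns = 1e9 / t floats per second = 1e3 / t Mfloat/s.)
fn mfloats_per_sec(ns_per_float: f64) -> f64 {
    1_000.0 / ns_per_float
}

/// Converts a per-float latency to MB/s, given the data set size and float count.
/// Assumption: 1 MB = 1e6 bytes.
fn mb_per_sec(ns_per_float: f64, total_mb: f64, count: usize) -> f64 {
    let bytes_per_float = total_mb * 1e6 / count as f64;
    // bytes per nanosecond equals GB/s; multiply by 1000 to get MB/s.
    bytes_per_float / ns_per_float * 1_000.0
}
```

For the uniform data set (50000 floats, 0.87 MB), the fast-float median of 20.59 ns/float gives 1000 / 20.59 ≈ 48.57 Mfloat/s, matching the table, and roughly 845 MB/s, close to the reported 846.33. The small gap is consistent with the file size being rounded to two decimals.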

opened by lemire 2

Remove unsafe without sacrificing speed by replacing i64 "exponent" with primitive enum
Replacing the exponent field in the Number struct with two fields, an enum representing the possible values and a flag for whether the exponent is negative, lets you remove the unsafe in float.rs:

unsafe { *TABLE.get_unchecked(exponent)}

can be replaced with

TABLE[exponent as usize]

with little to no speed change over the unsafe counterpart. In my (quite limited) testing, the unsafe code actually performed slower because of the dereference.
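The idea can be sketched as follows. This is an illustration only, not the crate's code: the table is shortened to four entries, and the names `Exp10`, `split`, `pow10` and `scale` are hypothetical. Because the enum has exactly one variant per table slot, every discriminant is a valid index, which may allow the optimizer to drop the bounds check (not verified here).

```rust
/// Hypothetical, shortened table of exact powers of ten.
const TABLE: [f64; 4] = [1e0, 1e1, 1e2, 1e3];

/// Hypothetical enum with exactly one variant per table slot.
#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(u8)]
enum Exp10 {
    E0 = 0,
    E1 = 1,
    E2 = 2,
    E3 = 3,
}

impl Exp10 {
    /// Splits a signed exponent into (magnitude, is_negative).
    /// Returns None when the magnitude has no table entry.
    fn split(exponent: i64) -> Option<(Exp10, bool)> {
        let magnitude = match exponent {
            0 => Exp10::E0,
            -1 | 1 => Exp10::E1,
            -2 | 2 => Exp10::E2,
            -3 | 3 => Exp10::E3,
            _ => return None,
        };
        Some((magnitude, exponent < 0))
    }
}

/// Safe lookup: every discriminant of Exp10 is a valid index into TABLE.
fn pow10(e: Exp10) -> f64 {
    TABLE[e as usize]
}

/// Scales a mantissa by 10^exponent using only safe indexing.
fn scale(mantissa: f64, exponent: i64) -> Option<f64> {
    let (magnitude, negative) = Exp10::split(exponent)?;
    Some(if negative {
        mantissa / pow10(magnitude)
    } else {
        mantissa * pow10(magnitude)
    })
}
```

The range check moves into `split`, which runs once when the number is tokenized, so the lookup itself needs neither `unsafe` nor a runtime failure path.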
opened by Protowalker 2
Fix no_std support

The crate does not set no_std even if you disable the std feature, so it can't be used in a no_std environment. A quick one-line fix and we're back in shape 😄

opened by avafloww 1
Fix Undefined Behavior in `check_len`
Comparison between pointers that do not reference the same array (or 1-past the end of the array) is undefined behavior. Quoting the Rust documentation:

If any of the following conditions are violated, the result is Undefined Behavior:

Both the starting and resulting pointer must be either in bounds or one byte past the end of the same allocated object.

The computed offset, in bytes, cannot overflow an isize.

The offset being in bounds cannot rely on “wrapping around” the address space. That is, the infinite-precision sum must fit in a usize.

Likewise, quoting the LLVM Language Reference:

This value only has defined behavior when used as an operand to the ‘indirectbr’ or ‘callbr’ instruction, or for comparisons against null. Pointer equality tests between labels addresses results in undefined behavior — though, again, comparison against null is ok, and no label is equal to the null pointer. This may be passed around as an opaque pointer sized value as long as the bits are not inspected. This allows ptrtoint and arithmetic to be performed on these values so long as the original value is reconstituted before the indirectbr or callbr instruction.

Therefore, the following code is undefined behavior. This should likely justify a new version release once this is published.

https://github.com/aldanor/fast-float-rust/blob/ec1b7d4ff98dc177b1f507e53c0f39a611542811/src/common.rs#L76-L79
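One way to avoid the problem is to compare lengths instead of forming a pointer that may lie past the end of the buffer. The sketch below is not the crate's actual fix: `Cursor` and `check_len_raw` are hypothetical names, and the raw-pointer variant assumes `ptr <= end` within the same buffer.

```rust
/// Hypothetical byte cursor; the crate's own type may differ.
struct Cursor<'a> {
    rest: &'a [u8],
}

impl<'a> Cursor<'a> {
    fn new(bytes: &'a [u8]) -> Self {
        Cursor { rest: bytes }
    }

    /// True if at least `n` bytes remain. Only lengths are compared,
    /// so no out-of-bounds pointer is ever computed.
    fn check_len(&self, n: usize) -> bool {
        n <= self.rest.len()
    }

    /// Advances by `n` bytes if that many remain; reports whether it did.
    fn advance(&mut self, n: usize) -> bool {
        if self.check_len(n) {
            self.rest = &self.rest[n..];
            true
        } else {
            false
        }
    }
}

/// Raw-pointer variant: compares the remaining distance as integers
/// instead of computing `ptr.add(n)`.
/// Assumption: `ptr <= end` and both point into the same buffer.
fn check_len_raw(ptr: *const u8, end: *const u8, n: usize) -> bool {
    (end as usize).wrapping_sub(ptr as usize) >= n
}
```

Both variants stay well-defined even for absurdly large `n`, because the subtraction happens on integers and no pointer offset is performed.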
opened by Alexhuszagh 1
Optimize Performance for Slow-Path Cases

Recently, during a major rewrite of lexical, I noticed that although fast_float and fast-float-rust were by far the fastest of any library, their performance degraded rapidly for near-halfway cases, especially those with many digits or large exponents. A detailed description can be found in https://github.com/fastfloat/fast_float/issues/93, and a PR fixing the issue can be found in https://github.com/fastfloat/fast_float/pull/96.
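To make "near-halfway" concrete, the sketch below uses inputs around 2^53 + 1, which lies exactly halfway between the adjacent f64 values 2^53 and 2^53 + 2. It parses with the standard library rather than fast-float to stay dependency-free, and the helper and constant names are hypothetical. The point is that a digit very far to the right decides the rounding direction, so a correct parser cannot stop after the first 19 or so significant digits.

```rust
/// Parses with the standard library (not fast-float) to keep the sketch self-contained.
fn parse_std(s: &str) -> f64 {
    s.parse().expect("valid decimal literal")
}

/// Exactly halfway between 2^53 and 2^53 + 2: ties round to the even mantissa, 2^53.
const HALFWAY: &str = "9007199254740993";

/// Barely above the halfway point: must round up to 2^53 + 2.
const ABOVE_HALFWAY: &str = "9007199254740993.00000000000000000000000000000000000001";

/// Barely below the halfway point: must round down to 2^53.
const BELOW_HALFWAY: &str = "9007199254740992.99999999999999999999999999999999999999";
```

With these inputs, `parse_std(HALFWAY)` and `parse_std(BELOW_HALFWAY)` yield 9007199254740992.0, while `parse_std(ABOVE_HALFWAY)` yields 9007199254740994.0.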

As with fast_float, I own all the code required, and am willing to submit it under any license, including public domain. It might take me a while to finish the PR, as I am busy with other projects and work, but I would be more than happy to submit one if there is interest.

opened by Alexhuszagh 1

Support for JSON Numbers

Adds support for #17.

This probably needs significant edits prior to merging, but the benchmarks with the primitive tokenizing are still pretty good. The tokenizer doesn't use an optimized memchr and has to duplicate a lot of work when parsing, so the slowdown isn't really surprising:

$ /home/ahuszagh/git/fast-float-rust/target/release/fast-float-simple-bench file ext/data/canada.txt 
=====================================================================================
|                         canada.txt (111126, 1.93 MB, f64)                         |
|===================================================================================|
|                                                                                   |
| ns/float                min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float            25.84    26.30    27.14    27.64    28.17    29.85    40.15 |
| fast-float-tokenized    44.33    45.83    46.25    46.71    47.36    49.06    88.16 |
| lexical               74.67    76.35    76.96    77.64    78.73    81.79   132.29 |
| from_str             218.95   220.82   222.62   224.34   226.81   237.21   407.54 |
|                                                                                   |
| Mfloat/s                min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float            24.90    33.59    35.51    36.18    36.85    38.02    38.69 |
| fast-float-tokenized    11.34    20.38    21.11    21.41    21.62    21.82    22.56 |
| lexical                7.56    12.23    12.70    12.88    12.99    13.10    13.39 |
| from_str               2.45     4.22     4.41     4.46     4.49     4.53     4.57 |
|                                                                                   |
| MB/s                    min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float           433.38   584.58   618.00   629.61   641.32   661.66   673.33 |
| fast-float-tokenized   197.39   354.72   367.43   372.58   376.30   379.67   392.55 |
| lexical              131.54   212.84   221.04   224.14   226.11   227.91   233.05 |
| from_str              42.70    73.39    76.73    77.57    78.17    78.81    79.47 |
|                                                                                   |
=====================================================================================

I've added tests to ensure the differences in parsing don't lead to correctness issues, as seen here.

The following methods have been added:

pub trait FastFloat {
    fn parse_from_parts<S: AsRef<[u8]>>(integral: S, fractional: S, exponent: i64, negative: bool) -> Self;
}

pub fn parse_from_parts<T: FastFloat, S: AsRef<[u8]>>(integral: S, fractional: S, exponent: i64, negative: bool) -> T;

In order to share more code, parse_long_mantissa has been changed to accept a Decimal by-value, and the two internal functions have been added:

pub fn parse_number_from_parts(i: &[u8], f: &[u8], e: i64, negative: bool) -> Option<Number>;
pub fn parse_decimal_from_parts(mut i: &[u8], mut f: &[u8], e: i64, negative: bool) -> Decimal;

Any feedback would be great; I'm probably going to need to refactor a bit to increase code re-use. The API can be used like this:

// This is equivalent to parse("-42823146028335318693e-128")
let f: f64 = parse_from_parts("42823146028335318693", "", -128, true);
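The intended meaning of the four arguments can be illustrated with a slow reference function that reassembles the parts into a string and hands it to the standard library parser. This is only a sketch of the semantics, not the PR's implementation: `parse_from_parts_reference` is a hypothetical name, it is limited to f64, and unlike the proposed API it returns None on invalid input instead of stopping early.

```rust
/// Hypothetical reference: rebuilds "[-]integral[.fractional]e<exponent>"
/// and parses it with the standard library.
fn parse_from_parts_reference(
    integral: &[u8],
    fractional: &[u8],
    exponent: i64,
    negative: bool,
) -> Option<f64> {
    let integral = std::str::from_utf8(integral).ok()?;
    let fractional = std::str::from_utf8(fractional).ok()?;

    let mut text = String::new();
    if negative {
        text.push('-');
    }
    // An empty integral part is treated as zero.
    text.push_str(if integral.is_empty() { "0" } else { integral });
    // Only emit the decimal point when there are fractional digits.
    if !fractional.is_empty() {
        text.push('.');
        text.push_str(fractional);
    }
    text.push('e');
    text.push_str(&exponent.to_string());

    text.parse().ok()
}
```

A function like this is mainly useful as a test oracle: a fast implementation of the tokenized entry point should return the same value for every valid combination of parts.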

If an invalid character is found, parsing merely stops early: no warning is given for invalid input, but no unsafe behavior results. The function assumes the input is valid, and the documentation clearly reflects this:

/// Parse a pre-tokenized decimal number from string into float.
///
/// This assumes the float has already been tokenized into valid
/// integral and fractional components, and has parsed an optional
/// exponent notation.
///
/// It is up to you to validate and tokenize the input: although
/// this will not error, this might truncate the significant
/// digits as soon as an invalid digit is found. This does not
/// handle special values, such as NaN, INF, or Infinity.
#[inline]
pub fn parse_from_parts<T: FastFloat, S: AsRef<[u8]>>(integral: S, fractional: S, exponent: i64, negative: bool) -> T;

opened by Alexhuszagh 8

[Feature Request] Support JSON Numbers
One of the major motivations for lexical-core's recent developments has been support for parsing floats of different formats, most notably JSON numbers.

A few notable differences exist in comparison to Rust floats, or those in other languages. For example, with Rust literals and Rust strings:

"NaN"       // valid
"nan"       // invalid
"1.23"      // valid
"1.23e"     // invalid
"1."        // valid
".1"        // valid
"1.23e5"    // valid
"+1.23e5"   // valid
"-1.23e5"   // valid

Meanwhile, in JSON, we get the following:

"NaN"       // invalid
"nan"       // invalid
"1.23"      // valid
"1.23e"     // invalid
"1."        // invalid
".1"        // invalid
"1.23e5"    // valid
"+1.23e5"   // invalid
"-1.23e5"   // valid

This can extend to various markup languages, like TOML, YAML (which has the same rules as JSON), XML, and others.

Providing a function of the signature fn parse_tokenized(integral: &[u8], fractional: &[u8], exponent: i32, negative: bool); would therefore allow users to validate their own float formats, while then letting fast-float-rust do the majority of the heavy lifting. It would also not accept special floats.

This should require minimal changes in the parsing implementation, while making the library much more suitable for general-purpose applications.
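The caller-side half of this proposal, validating and splitting a JSON number, might look like the sketch below. It is an assumption-laden illustration: `Parts`, `digit_run` and `tokenize_json_number` are hypothetical names, the grammar follows the JSON number rules listed above, and the exponent saturates at the i32 limits rather than reporting overflow.

```rust
/// Hypothetical output type mirroring the arguments of the proposed `parse_tokenized`.
#[derive(Debug, PartialEq)]
struct Parts<'a> {
    integral: &'a [u8],
    fractional: &'a [u8],
    exponent: i32,
    negative: bool,
}

/// Length of the leading run of ASCII digits.
fn digit_run(s: &[u8]) -> usize {
    s.iter().take_while(|b| b.is_ascii_digit()).count()
}

/// Validates `input` as a JSON number and splits it into its parts.
/// Returns None for anything the JSON grammar rejects.
fn tokenize_json_number(input: &[u8]) -> Option<Parts<'_>> {
    let mut s = input;

    // Optional leading minus; a leading plus is not allowed in JSON.
    let negative = s.first().copied() == Some(b'-');
    if negative {
        s = &s[1..];
    }

    // Integral part: at least one digit, no leading zero unless it is just "0".
    let n = digit_run(s);
    if n == 0 || (n > 1 && s[0] == b'0') {
        return None;
    }
    let (integral, mut s) = s.split_at(n);

    // Optional fraction: '.' followed by at least one digit.
    let mut fractional: &[u8] = &[];
    if s.first().copied() == Some(b'.') {
        let n = digit_run(&s[1..]);
        if n == 0 {
            return None;
        }
        fractional = &s[1..1 + n];
        s = &s[1 + n..];
    }

    // Optional exponent: 'e' or 'E', optional sign, at least one digit.
    let mut exponent: i32 = 0;
    if matches!(s.first().copied(), Some(b'e') | Some(b'E')) {
        s = &s[1..];
        let exp_negative = match s.first().copied() {
            Some(b'-') => {
                s = &s[1..];
                true
            }
            Some(b'+') => {
                s = &s[1..];
                false
            }
            _ => false,
        };
        let n = digit_run(s);
        if n == 0 {
            return None;
        }
        for &b in &s[..n] {
            // Saturate instead of overflowing on absurdly long exponents.
            exponent = exponent.saturating_mul(10).saturating_add(i32::from(b - b'0'));
        }
        if exp_negative {
            exponent = -exponent;
        }
        s = &s[n..];
    }

    // Trailing bytes make the whole input invalid.
    if !s.is_empty() {
        return None;
    }
    Some(Parts { integral, fractional, exponent, negative })
}
```

The resulting parts could then be handed to a tokenized parsing entry point of the kind requested here, so that format validation stays with the caller and the float conversion stays in the library.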
opened by Alexhuszagh 5

Releases(v0.2.0)

v0.2.0(Jan 13, 2021)
Fixed an edge case where long decimals with trailing zeros were truncated.

Minor micro-optimization fixes in the fast path parser.

Remove the use of unsafe when querying power-of-10 tables.

Added float64 roundtrip fuzz target.

Added tests for the power-of-5 table using num-bigint.

Improvements and new options in the bench tool.

Updated benchmark timings, added Apple M1 and AMD Rome timings.

v0.1.0(Jan 10, 2021)

Initial release, fully tested and benchmarked.