Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Round trip floats #128

Closed
dtolnay opened this issue Aug 2, 2016 · 15 comments
Closed

Round trip floats #128

dtolnay opened this issue Aug 2, 2016 · 15 comments
Labels

Comments

@dtolnay
Copy link
Member

dtolnay commented Aug 2, 2016

Ideally this test would pass.

extern crate serde_json;

#[macro_use]
extern crate quickcheck;

quickcheck! {
    fn floats(n: f64) -> bool {
        let j = serde_json::to_string(&n).unwrap();
        serde_json::from_str::<f64>(&j).unwrap() == n
    }
}

On the printing side grisu2 guarantees that the f64 closest to the string representation is identical to the original input, so the inaccuracy is somewhere on the parsing side.

@Mark-Simulacrum
Copy link

I'm not sure the conclusion that the issue is with parsing is entirely correct (although there do seem to be problems in that area as well), since this test fails in the dtoa crate: test_write(17.922589369123898, "17.922589369123898");.

    thread 'test_f64' panicked at 'assertion failed: `(left == right)` (left: `"17.922589369123899"`, right: `"17.922589369123898"`)', tests/test.rs:36

@dtolnay
Copy link
Member Author

dtolnay commented Aug 2, 2016

@Mark-Simulacrum the f64 that is closest to 17.922589369123899 is identical to the f64 that is closest to 17.922589369123898, so dtoa (grisu2) is correct here.

assert_eq!(17.922589369123898, 17.922589369123899);

As I wrote:

grisu2 guarantees that the f64 closest to the string representation is identical to the original input

@dtolnay
Copy link
Member Author

dtolnay commented Aug 2, 2016

In the extreme, you probably wouldn't expect this test to pass:

test_write(17.92258936912389800000000000000001f64, "17.92258936912389800000000000000001");

@Mark-Simulacrum
Copy link

Hmm, I wonder how Javascript's JSON.parse/JSON.stringify deal with this. JSON.stringify(JSON.parse('{ "test": 17.922589369123898 }')) returns the expected (by me, at least) 17.922589369123898 despite assert_eq!(17.922589369123898, 17.922589369123899); being true in JS. Well, I somewhat expect that decode(encode(any f64)) == the same f64; but I guess that isn't the case at least for serde-json and json-rust: perhaps this isn't a good requirement.

@dtolnay
Copy link
Member Author

dtolnay commented Aug 2, 2016

@Mark-Simulacrum using f64 you cannot have both of these return "the expected":

JSON.stringify(JSON.parse('{ "test": 17.922589369123898 }'))
JSON.stringify(JSON.parse('{ "test": 17.922589369123899 }'))

because those are the same f64.

@dtolnay
Copy link
Member Author

dtolnay commented Aug 2, 2016

@Mark-Simulacrum

Well, I somewhat expect that decode(encode(any f64)) == the same f64; but I guess that isn't the case at least for serde-json and json-rust: perhaps this isn't a good requirement.

Actually that is a good requirement. That is what this issue is asking for. That is what grisu2 guarantees should work if your parsing is good. This issue is for making our parsing good.

The JS code you gave is a different (not good) requirement of encode(decode(str)) == str.

@Mark-Simulacrum
Copy link

Okay. Well, I'd be glad to help out, but I'm not sure where to start--or if you'd even like help :).

@dtolnay
Copy link
Member Author

dtolnay commented Aug 2, 2016

We would love help as always! The number-parsing code starts here: de.rs#L224. Currently if it sees something like 17.922589369123899 then it computes 17922589369123899u64 as f64 / 10e14 which is pretty good but we need a better algorithm. A place to start would be looking at how RadidJSON handles this case.

@Mark-Simulacrum
Copy link

Mark-Simulacrum commented Aug 2, 2016

I've started looking into it, but haven't had any luck yet. I don't want to block anyone else from working on it, especially since I probably won't have much luck with it--I'm not very proficient at debugging parsing/serializing code.

My initial findings seem to point to u64 as f64 not being a safe cast (we lose ~3 digits of precision there). I'm not sure how to work around that as of yet though.

@dtolnay
Copy link
Member Author

dtolnay commented Aug 2, 2016

Looks like RapidJSON does the same thing we do. Here is the cast and here is the division.

@dtolnay
Copy link
Member Author

dtolnay commented Feb 26, 2017

Let's follow up with a value -> string -> value roundtripping quickcheck once this is implemented. #138 has most of the code.

@mthiesen
Copy link

This is what V8 uses for parsing strings to floats. Supposedly this guarantees correct round tripping of IEEE floats.

@dtolnay dtolnay removed their assignment May 7, 2018
C-Saunders added a commit to C-Saunders/google_analytics_v4_report_flattener that referenced this issue Sep 23, 2018
The test report added here  was manually modified, so the mimumims,
totals, etc. are not correct. This was done to shorten the test
assertions and to get around an issue with string to double parsing
in serde_json (serde-rs/json#128) that made
assertions around the many-decimaled values unreliable.
C-Saunders added a commit to C-Saunders/google_analytics_v4_report_flattener that referenced this issue Sep 28, 2018
… ranges. (#6)

* Does some refactoring to make functions shorter. They are doing some
funky mutable reference passing that we may want to eliminate in the future.
* Switches from Joinery to Itertools because  Itertools was able to handle
joining the result of flat_value_iterator, which is a FlatMap.
* The test report added here was manually modified, so the minimums,
totals, etc. are not correct. This was done to shorten the test
assertions and to get around an issue with string to double parsing
in serde_json (serde-rs/json#128) that made
assertions around the many-decimaled values unreliable.
@donbright
Copy link

donbright commented Dec 2, 2018

can we have the option to parse into some kind of decimal type? (as in, base 10 floating point)

@demurgos
Copy link

demurgos commented Apr 21, 2019

Hi,
What is the current state of support for floats?

I saw that ryu is a dependency but floats are still not round-tripping when using the examples from the README. I cannot find how to enable precise floats.

Edit: Following more tests, I may be wrong and round-trip works. It would still be nice to have an update for this issue.

@dtolnay
Copy link
Member Author

dtolnay commented Apr 21, 2019

The ryu crate is for printing floats to string, so it wouldn't help with the parsing side which is where loss of accuracy happens.

I am closing this issue and will open a new issue to hopefully make it clear that the remaining work is about parsing.

@dtolnay dtolnay closed this as completed Apr 21, 2019
@serde-rs serde-rs locked and limited conversation to collaborators Apr 21, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Development

No branches or pull requests

5 participants