Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parser cannot read arbitrary precision numbers #18

Closed
apoelstra opened this issue Nov 5, 2015 · 21 comments
Closed

Parser cannot read arbitrary precision numbers #18

apoelstra opened this issue Nov 5, 2015 · 21 comments

Comments

@apoelstra
Copy link

http://www.ecma-international.org/publications/files/ECMA-ST/ECMA-404.pdf specifies in the second paragraph of the introduction that JSON is agnostic about numbers, and simply represents them as a series of digits.

However, serde-json parses anything with a decimal point as a Rust f64, which causes numbers to be read incorrectly. There is no way to avoid this because this behaviour is chosen as soon as a decimal point is encountered. This makes it impossible to use serde-json to interoperate with financial software using JSON.

@JohnHeitmann
Copy link

It'd be nice for both ergonomics and correctness if serde supported this.

But... how common are protocols that rely on this behavior? The two money protocols I'm familiar with, Stripe and Stockfighter, both use integer cents. I believe this is to avoid issues with implementations like Javascript's that are more hopelessly entangled with lost precision.

I'm trying to get a feel for priority, not arguing against the request.

@apoelstra
Copy link
Author

@JohnHeitmann Bitcoin Core, and things with API-compatibility with it, uses these numbers for monetary values. This is an historical accident but one that can't be changed now without breaking a lot of applications.

This is the only application I'm aware of, but it's a pretty big one.

@nubis
Copy link

nubis commented Apr 25, 2016

Hi, I'm going to try and send you a pull request dealing with this. You'll find the lost precision to be an issue with almost every bitcoin exchange API and Forex api's as well. Some even go the way of serializing decimals as JSON strings so it's more evident for the client/consumer.

And for most other cases, even when I don't know about the precision of the 0.00022 value being sent from JS, I don't want my rust to make it worse giving me a 0.00029999...

I'm familiar with a breadth of Ruby JSON libraries and the community seems to have agreed upon deserialising JSON floats as decimals by default, with an option to reverting to Floats.

Rust has a decimal crate since december 2015, looks very complete, and seems to be using num under the hood so they probably don't reinvent much. I'll try using that. https://crates.io/crates/decimal

Anyways just to let you know this is being looked at :)

@dtolnay
Copy link
Member

dtolnay commented Apr 25, 2016

@nubis Okay but once you parse it, what do you do with the result? The Visitor trait will require you to convert it to f64 in order to do anything.

You are going to need to fix serde-rs/serde#185 before you fix this.

@nubis
Copy link

nubis commented Apr 25, 2016

hehe, I was coming to that same conclusion by the time you wrote that.

This was referenced Apr 25, 2016
@nubis
Copy link

nubis commented Apr 29, 2016

Hi people, just to share some progress on this, implementing decimal. I've forked both serde https://github.com/nubis/serde and serde_json https://github.com/nubis/json

It took me a few nights to make sense of all the codebase and what goes where, but I finally got decimal deserialization to work.

I'm going to focus now on cleaning everything up, so I'm still a long way to go.

While working on it, and just to make it easier on myself I made it so that all numbers are initially parsed as decimals. As the decimal module has it's own internal parser it lifts some weight from serde which essentially just parses any '0'...'9' or '.'

I do think it's a good idea to start parsing as if it was an integer then scale up to a decimal once we're sure we need the precision, so I'm going to revert to something similar to what you had before, but using decimals instead of f64.

Then, I also think conversions between d128 and f64 are more lossy than they should so I'm going to check that too.

I'd really like some feedback if any of you can give it, I think d128 could be more integrated like any other primitive and its implementations could be using the existing macros.

@laktak
Copy link
Contributor

laktak commented Jul 19, 2016

just fyi - str.parse::<f64>() does a much better job than the current implementation and could be used as a stopgap measure. I separated it from the format check in Hjson.

@dtolnay
Copy link
Member

dtolnay commented Jul 19, 2016

@laktak how did you determine that str.parse::<f64>() does a better job? Are there some inputs on which it is more accurate? What are they?

I recently rewrote float parsing to be 40% faster and much more accurate in #110. Does that resolve the difference?

@laktak
Copy link
Contributor

laktak commented Jul 20, 2016

Sure, here's a simple test with the current versions:

    let input = r#"{
        "x1": -9876.543210,
        "x2": 0.14,
        "x3": 0.1401,
        "x4": 0.14004,
        "x5": 0.14005
    }"#;

    let json: serde_json::Value = serde_json::from_str(&input).unwrap();
    println!("{}", serde_json::to_string_pretty(&json).unwrap());

    let hjson: serde_hjson::Value = serde_hjson::from_str(&input).unwrap();
    println!("{:?}", hjson);
{
  "x1": -9876.543210000002,
  "x2": 0.14,
  "x3": 0.1401,
  "x4": 0.14004000000000003,
  "x5": 0.14005
}
{
  x1: -9876.54321
  x2: 0.14
  x3: 0.1401
  x4: 0.14004
  x5: 0.14005
}

For hjson I adopted your code to do the testing but then fed the string to str.parse::<f64>().

@dtolnay
Copy link
Member

dtolnay commented Jul 20, 2016

If by "current versions" you mean serde_json 0.7.4, that does not have my fixes. They will be in 0.8.0. On the current master branch I get the correct output.

@laktak
Copy link
Contributor

laktak commented Jul 20, 2016

Sorry, forgot to switch the branch. There are still issues with different numbers though. E.g.

    let input = r#"{
        "x1": 4094.1111111111113,
        "x2": 0.16666666666666666
    }"#;

prints

{
  "x1": 4094.1111111111115,
  "x2": 0.16666666666666667
}
{
  x1: 4094.1111111111113
  x2: 0.16666666666666666
}

@dtolnay
Copy link
Member

dtolnay commented Jul 20, 2016

Neither of those is a parsing issue - the numbers are parsed correctly but printed incorrectly. I can take a look later.

@dtolnay
Copy link
Member

dtolnay commented Jul 20, 2016

I filed Tencent/rapidjson#687 to follow up on the printing issue upstream.

@dtolnay
Copy link
Member

dtolnay commented Feb 11, 2017

I have an arbitrary precision implementation in #252. If you have been waiting for this feature, please take a look and let us know whether it addresses your use case.

The PR adds a cfg:

serde_json = { version = "0.9", features = ["arbitrary_precision"] }

that turns serde_json::Number and serde_json::Value into arbitrary precision types. If you parse some JSON to either of those types, serializing it back to JSON or calling to_string() will preserve exactly the original representation.

The implementation relies on specialization so will only work on nightly for now.

@dtolnay
Copy link
Member

dtolnay commented Feb 13, 2017

The PR is ready to merge but I would like to hear from at least two people that this implementation would be useful to them. I don't want to merge and maintain this if nobody will use it.

@alexreg
Copy link
Contributor

alexreg commented Feb 13, 2017

I would be happy even to help you maintain this, if you like. The code looks pretty clear now that I've gone over it.

@alexreg
Copy link
Contributor

alexreg commented Feb 13, 2017

Also, I think usage will increase hugely once specialisation gets stabilised. (I'm not sure when that's coming; next release perhaps?)

@oli-obk
Copy link
Member

oli-obk commented Feb 13, 2017

next release perhaps?

by the look from the tracking issue and the 2017 roadmap, I'd say next year at the earliest.

@alexreg
Copy link
Contributor

alexreg commented Feb 13, 2017 via email

@dtolnay
Copy link
Member

dtolnay commented Aug 23, 2017

#348 has a promising implementation with the same API as #252 but without relying on unstable specialization. It also interacts better with deserializer "adapters" like erased-serde, serde-ignored, and serde-encrypted-value which would not have worked with specialization.

@dtolnay
Copy link
Member

dtolnay commented Mar 28, 2018

The arbitrary_precision feature released in serde_json 1.0.13 adds support for dealing with arbitrary precision numbers through the serde_json::Number type.

@dtolnay dtolnay closed this as completed Mar 28, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

7 participants