
Allow increasing recursion limit #334

Closed
Yoric opened this issue Jun 14, 2017 · 23 comments

Yoric commented Jun 14, 2017

I am parsing/serializing pretty large JSON files and I regularly encounter RecursionLimitExceeded. I need a way to instantiate a Serializer/Deserializer with a much larger recursion limit.

Could we introduce code to let us tweak that?

dtolnay (Member) commented Jun 15, 2017

Seems reasonable. What do you think the API should look like?

dtolnay (Member) commented Sep 3, 2017

While we iterate on the API in #346, have you been able to figure out the appropriate recursion limit for your use case? If it is not too much more than the current default, we can raise the default without yet making it configurable.

Yoric (Author) commented Sep 3, 2017

I believe that 1024 should be more than sufficient for my use cases.

Yoric added a commit to Yoric/json that referenced this issue Sep 4, 2017
Benchmarks indicate that changing `remaining_depth` from `u8` to
`usize` doesn't cause any performance regression (on my computer, it
actually marginally improves performance), so we perform this change
and take the opportunity to bump the value to a more generous depth of 1024.

Resolves serde-rs#334 - Implementing Serializer::with_recursion_limit

Yoric added a commit to Yoric/json that referenced this issue Sep 7, 2017
dtolnay (Member) commented Sep 8, 2017

Copied from the discussion in #346 -- the most important thing we still need to figure out here:

How to reconcile with an eventual serde_json::Format. I would feel more comfortable if we make more progress on serde-rs/serde#644 before merging this. That issue will involve some data structure that describes JSON as a format, including whether to use indentation when serializing and presumably the recursion limit for deserializing. It may turn out to be a better API to treat that data structure as a builder for serde_json::Deserializer, rather than having builder methods directly on Deserializer.
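To make that idea concrete, here is a purely hypothetical sketch of a format-as-builder API; none of these types or methods exist in serde_json, and `Format`, `recursion_limit`, and `deserializer_from_str` are illustrative names only:

```rust
// Purely hypothetical sketch: a `Format` value owns the configuration and
// acts as a builder for a deserializer. None of these names exist in serde_json.
pub struct Format {
    recursion_limit: usize,
    indent: bool,
}

impl Format {
    pub fn new() -> Self {
        Format { recursion_limit: 128, indent: false }
    }

    pub fn recursion_limit(mut self, limit: usize) -> Self {
        self.recursion_limit = limit;
        self
    }

    pub fn indent(mut self, indent: bool) -> Self {
        self.indent = indent;
        self
    }

    // The format object, rather than the deserializer itself, carries the
    // settings and hands them over at construction time.
    pub fn deserializer_from_str<'a>(&self, input: &'a str) -> HypotheticalDeserializer<'a> {
        HypotheticalDeserializer { input, remaining_depth: self.recursion_limit }
    }
}

pub struct HypotheticalDeserializer<'a> {
    input: &'a str,
    remaining_depth: usize,
}
```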

remexre commented Feb 13, 2018

Any progress/changes here? I've got a project (https://github.com/remexre/extlint) that stores an OCaml AST in JSON; in OCaml's AST, a list is transformed into a series of conses (i.e. [1; 2; 3] and 1 :: (2 :: (3 :: ([]))) have the same AST representation), which blows the recursion limit even for a list literal of length 50.
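To illustrate the shape of the problem (the type below is made up for illustration, not the actual extlint AST):

```rust
use serde::{Deserialize, Serialize};

// Illustrative only: a cons-style list nests one level deeper per element,
// so even short lists produce deeply nested JSON when serialized.
#[derive(Serialize, Deserialize)]
enum ConsList {
    Nil,
    Cons(i64, Box<ConsList>),
}

// [1; 2; 3] corresponds to Cons(1, Cons(2, Cons(3, Nil))). Each element adds a
// couple of levels of JSON nesting under serde's default enum representation,
// plus whatever wrapper nodes a real AST adds, so a 50-element list can easily
// exceed serde_json's default recursion limit of 128.
```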

I'm fine with manually increasing the stack size, but being able to handle this is a high priority. Is the best solution still to apply #346 on top of the current master and add a note in my README that we use this custom version?

Yoric (Author) commented Feb 21, 2018

This problem has caused me to switch to json-rust. It doesn't do nearly as many things as serde_json, but it doesn't share this specific issue.

remexre commented Feb 21, 2018

Yeah, I need serde for the project though, since I'm essentially transforming JSON -> Rust data structure -> custom serialization, and I don't want to write a custom Serialize for each AST node type.

@RReverser

Would integrating https://github.com/alexcrichton/stacker perhaps help more reliably here (by using stacker::maybe_grow when recursing into objects/arrays)?
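For reference, a minimal sketch of the stacker pattern being suggested, applied to an illustrative recursive function rather than serde_json's internals (the thresholds are arbitrary):

```rust
// Illustrative use of stacker::maybe_grow around a recursive step: if fewer
// than 64 KiB of stack remain, a new 1 MiB segment is allocated before the
// closure runs, so deep recursion does not overflow the stack.
fn depth(value: &serde_json::Value) -> usize {
    stacker::maybe_grow(64 * 1024, 1024 * 1024, || match value {
        serde_json::Value::Array(items) => 1 + items.iter().map(depth).max().unwrap_or(0),
        serde_json::Value::Object(map) => 1 + map.values().map(depth).max().unwrap_or(0),
        _ => 1,
    })
}
```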

dtolnay (Member) commented Jan 11, 2019

Looks promising. I would be ready to consider a PR that incorporates stacker behind a cfg.

RReverser commented Jan 11, 2019

I've tried that in a local branch, and it mostly works even with the ridiculously large JSON nesting used for the test, but then a stack overflow occurs deep within the drop_in_place call chain when the parsed Value gets deallocated.

I could add a similar guard for these deallocations, but given that users might, and most often will, parse into their own data structures, and there are lots of other ways to create overly deep data structures, I'm not sure it should be serde-json's responsibility to handle stack overflows in Drop.

Yoric (Author) commented Jan 11, 2019

If necessary, how hard would it be to reimplement drop to avoid recursion?

@RReverser

It wouldn't be too hard, and it should be trivial-ish to provide a newtype wrapper that would take care of this, but users would need to use it themselves in their own data structures. Hence my concern about whether this belongs in serde-json or not.

remexre commented Jan 11, 2019

In theory the drop method could delegate to a depth_limited_drop which replaces children with Null down to a certain recursion limit, pushing further elements into a queue; I can sketch something up in a bit.
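A rough sketch of that kind of non-recursive teardown, written against a made-up `Node` enum rather than the real `Value`; this variant drains children into a flat worklist instead of replacing them with Null, but the effect is the same: no deep recursion in Drop.

```rust
// Illustrative only: a Value-like enum whose Drop drains nested children into
// a queue, so deallocation happens iteratively instead of via deep recursion.
enum Node {
    Null,
    Number(f64),
    Array(Vec<Node>),
    Object(Vec<(String, Node)>),
}

impl Drop for Node {
    fn drop(&mut self) {
        let mut queue = Vec::new();
        // Move this node's direct children out, leaving it childless.
        match self {
            Node::Array(items) => queue.extend(items.drain(..)),
            Node::Object(entries) => queue.extend(entries.drain(..).map(|(_, v)| v)),
            _ => {}
        }
        // Keep flattening until nothing nested remains; every node dropped at
        // the end of an iteration is already childless, so its own Drop is shallow.
        while let Some(mut node) = queue.pop() {
            match &mut node {
                Node::Array(items) => queue.extend(items.drain(..)),
                Node::Object(entries) => queue.extend(entries.drain(..).map(|(_, v)| v)),
                _ => {}
            }
        }
    }
}
```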

@RReverser

As I said, it's not really a problem to implement this for Value at all, but fixing it is not going to help with custom data structures, so I'm more inclined to leave this up to the user.

@RReverser

Also, while this still fails for nesting of 64K depth, it does help with e.g. 8K depth, where previously it would fail, so I guess it is still worth it.

@RReverser

Btw, looks like there's already a separate issue for deep Value dropping: #440

Yoric (Author) commented Jan 11, 2019

> As I said, it's not really a problem to implement this for Value at all, but fixing it is not going to help with custom data structures, so I'm more inclined to leave this up to the user.

I'm not sure I follow. What kind of custom data structures are you talking about?

remexre commented Jan 11, 2019

If you're deserializing a 64k-deep JSON object, you're probably deserializing into a 64k-deep-nested non-JSON value.

@RReverser

> I'm not sure I follow. What kind of custom data structures are you talking about?

Arbitrary user-defined data structures implementing Deserialize (either via derive or manually).

@RReverser

> If you're deserializing a 64k-deep JSON object, you're probably deserializing into a 64k-deep-nested non-JSON value.

Not sure what that was a response to, but yes.

@RReverser

Note that deeply nested data structures would have problems not only with Drop but also with Display / Debug / PartialEq / PartialOrd and lots of other recursive methods, which is why I'm arguing that this is best left to the user and serde-json shouldn't do anything special (since, aside from parsing, there are lots of other ways to create such problematic instances).

@RReverser

Hmm, now that it's implemented, I'm starting to wonder whether a safer approach would be not to grow the stack indefinitely, but merely to detect how much is left, without a hardcoded depth value. This would allow users to configure their stack size via RUST_MIN_STACK or a thread builder, as they normally would to prevent other stack overflows, and parsing would still stop safely when there is not enough stack left.
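A minimal sketch of that stack-probing check, using stacker's remaining_stack; the 32 KiB threshold and the error string are illustrative, not anything serde_json actually does:

```rust
// Sketch: instead of counting nesting depth, probe how much stack is actually
// left and refuse to recurse further once it drops below a safety margin.
fn ensure_stack_available() -> Result<(), String> {
    match stacker::remaining_stack() {
        Some(remaining) if remaining < 32 * 1024 => {
            Err("not enough stack left to keep parsing".to_owned())
        }
        // Either plenty of stack remains, or the platform cannot report it.
        _ => Ok(()),
    }
}
```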

dtolnay (Member) commented Feb 1, 2019

Fixed in 1.0.38: Deserializer::disable_recursion_limit

dtolnay closed this as completed Feb 1, 2019
ivan-aksamentov added a commit to nextstrain/nextclade that referenced this issue Oct 11, 2022
Resolves #1018

When the tree JSON is sufficiently nested, `serde_json` fails to parse it into the `AuspiceTree` struct with:

```
recursion limit exceeded at line <line> column <column>
```

This problem has been reported before against `serde_json` in serde-rs/json#334, and a solution that allows removing the recursion limit was implemented in the form of [`Deserializer::disable_recursion_limit`](https://docs.rs/serde_json/latest/serde_json/struct.Deserializer.html#method.disable_recursion_limit), at the cost of an increased risk of overflowing the call stack.

So here I implemented the suggested solution by calling `disable_recursion_limit()` on the deserializer. I am also mitigating the risk of stack overflow by using the growing-stack adapter from [`serde_stacker`](https://docs.rs/serde_stacker/latest/serde_stacker/).

This allows the large tree from #1018 to be parsed successfully, although it probably also makes tree parsing a bit slower. That is not critical for overall runtime, as it is a one-off operation.
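For reference, the combination described above looks roughly like this (`disable_recursion_limit` requires serde_json's `unbounded_depth` feature; error handling is kept minimal):

```rust
use serde::Deserialize;
use serde_json::Value;

// Parse arbitrarily deep JSON: lift serde_json's depth check and let
// serde_stacker grow the call stack as deserialization recurses.
fn parse_deep(json: &str) -> Result<Value, serde_json::Error> {
    let mut deserializer = serde_json::Deserializer::from_str(json);
    deserializer.disable_recursion_limit();
    let deserializer = serde_stacker::Deserializer::new(&mut deserializer);
    Value::deserialize(deserializer)
}
```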