Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Questions about enum deserialization #782

Closed
vedantroy opened this issue Jun 24, 2021 · 2 comments
Closed

Questions about enum deserialization #782

vedantroy opened this issue Jun 24, 2021 · 2 comments

Comments

@vedantroy
Copy link

I've been working on fixing the enum implementation in msgpack-rust: 3Hren/msgpack-rust#225.

I was looking at the implementation of enum deserialization here:

json/src/de.rs

Line 1837 in df1fb71

fn deserialize_enum<V>(

and tried to copy it for msgpack-rust.

My attempt is here:
https://github.com/vedantroy/msgpack-rust/blob/2da1609d9d66aee94e80b3640d5a3c0d93a6a8a7/rmp-serde/src/decode.rs#L639

Quick context:
In msgpack-rust, enums are currently serialized as maps with a single key / value, where the key = an integer / string, and the value is any associated data. I'm changing it so that if enums are unit variants, then they aren't serialized as maps.

Explanation of the code:
The code above checks if there is a map, if there is, it will deserialize the enum using VariantAccess, which will handle any associated data, otherwise, we deserialize the enum as a unit variant using UnitVariantAccess (I should be doing a check here to see whether the "marker" is a valid integer / string). The problem is that take_or_read_marker consumes the token, (unlike in serde_json, where parse_whitespace does not consume the token), in the UnitVariantAccess case, this causes variant_seed to try to deserialize what is after the enum into an enum variant, which causes a bug.

https://github.com/vedantroy/msgpack-rust/blob/2da1609d9d66aee94e80b3640d5a3c0d93a6a8a7/rmp-serde/src/decode.rs#L861

An example:

We have the enum: enum Foo { A, B }.
The MessagePack data buffer is [0, 1]. visit_enum reads the 0 "marker" and calls visitor.visit_enum(UnitVariantAccess::new(self)), but the reader is now right before the 1, it then calls UnitVariantAccess::variant_seed, which will end up deserializing the 1 into variant B and leaving the reader at the end of the buffer. When in reality, we wanted to deserialize an A and leave the reader after the 0 but before the 1.

Was wondering if you had any thoughts on what to do.

Hope this makes sense!

@vedantroy vedantroy reopened this Jun 24, 2021
@vedantroy
Copy link
Author

Leaving this open -- b/c not sure if this is the right way to do things. I understand this question is pretty vague though -- so feel free to close whenever.

@dtolnay dtolnay closed this as completed Jan 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

2 participants