Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add codegen expr magic for default, skip serializing, and alternative serializers #238

Merged
merged 9 commits into from Feb 21, 2016

Conversation

erickt
Copy link
Member

@erickt erickt commented Feb 16, 2016

This is an alternative implementation of #214, #198, #208 and implements #90 and #216. This allows one to write:

#[derive(Serialize, Deserialize)]
struct Struct {
    #[serde(default="123")]
    a: i32,
    #[serde(skip_serializing_if="self.b == 123")]
    b: i32,
    #[serde(serialize_with="(self.b == 123).serialize(serializer)"]
    c: i32,
    #[serde(deserialize_with="Ok(if try!(bool::deserialize(deserializer) { 123) } else { 0 })"]
    d: i32,
}

cc @oli-obk, @target-san, @arcnmx

@erickt
Copy link
Member Author

erickt commented Feb 16, 2016

One thing I can't tell is a great idea or not is that for #[serde(serialize_with="...", deserialize_with="...")] I introduce the variables serializer and deserializer, which seems a tad magical to me. It also could be used to violate the Serialize contract by pushing multiple times, but one could already do that if they write their own implementation. The only way I can think of that avoids introducing this name would be to have these expressions return a serializable/deserializable value, but that seems complicated, and I'm not even quite sure how we could even do it for deserialization. shrug

They are also a bit wordy, but I'm not sure if I want to impose my ser and de abbreviation here. I'd rather err on readability.

@erickt erickt added this to the v0.7.0 milestone Feb 16, 2016
@erickt
Copy link
Member Author

erickt commented Feb 16, 2016

Also would address #213.

@oli-obk
Copy link
Member

oli-obk commented Feb 16, 2016

on the readability part I'd prefer to only allow entering a function name. Code in strings is confusing to me.

cx,
&field.node.ty,
generics,
try!(parse_lit_into_expr(cx, name, lit)),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this allow calling functions?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it allows arbitrary expressions.

}
};

let expr = parse::parse_expr_from_source_str("<lit expansion>".to_string(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this fails, the error reporting will be very confusing. is there a way to give a span?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point, I'll add it.

@oli-obk
Copy link
Member

oli-obk commented Feb 16, 2016

lgtm. Had to dig into the aster source to grok some of it. Are the aster-docs somewhere online?

@jnicholls
Copy link
Contributor

I agree with the "code in strings" bit. For the simple default value and boolean check expression, it seems natural to me. The serialize_with and deserialize_with seems more natural to be functions that are called with a &mut to the serializer/deserializer.

@erickt
Copy link
Member Author

erickt commented Feb 18, 2016

@oli-obk: Yeah, code in strings is a bit ugly. It feels worth it for default and skip_serializing_if, but complicated for serialize_with and deserialize_with. It'd be pretty simple to change this to functions, as in:

#[derive(Serialize, Deserialize)]
struct Struct {
    #[serde(default="default_a")]
    a: i32,
    #[serde(skip_serializing_if="skip_serializing_b")]
    b: i32,
    #[serde(serialize_with="serialize_with_c")]
    c: i32,
    #[serde(deserialize_with="deserialize_with_d")]
    d: i32,
}

fn default_a() -> i32 { 123 }

fn skip_serializing_b(value: &Struct) -> bool { value.b == 123 }

fn serialize_with_c<S>(serializer: &mut S,
                       value: &Struct) -> Result<(), S::Error>
    where S: serde::Serializer
{
    (value.b == 123).serialize(serializer)
}

fn deserialize_with_d<D>(deserializer: &mut D) -> Result<i32, S::Error>
    where D: serde::Deserializer
{
    Ok(if try!(bool::deserialize(deserializer) { 123) } else { 0 })
}

But it would only be slightly more typing to do this with expressions:

#[derive(Serialize, Deserialize)]
struct Struct {
    #[serde(default="default_a()")]
    a: i32,
    #[serde(skip_serializing_if="skip_serializing_b(self)")]
    b: i32,
    #[serde(serialize_with="serialize_with_c(serializer, self)")]
    c: i32,
    #[serde(deserialize_with="deserialize_with_d(deserializer)")]
    d: i32,
}

I expect we'd still have to parse out these strings as a path, so that we can reference other crates, and possibly bundle in types, a la #[serde(default="some::module::my_default::<u32>"] these strings could still be a tad complicated in generic types.

@erickt
Copy link
Member Author

erickt commented Feb 18, 2016

@oli-obk: The aster docs are here, but I haven't actually written any real documentation for it :(

This feature adds support for the default to be specified to be
some expression (which unfortunately needs to be parsed from
a string) without needing this value to have an implementation
of `Default`, or for a new-type wrapper in order to provide an
alternative implementation. This expression is run in a function,
and therefore has no access to any of the internal state of
the deserializer.
This allows end users to use an arbitrary expression to decide whether
or not to serialize some field. This expression has access to all the
fields in the struct, but none of the internal state of the Serialize
implementation. For structs, serde implements this by creating a
temporary trait and implementing the struct for it. For struct variants,
the fields are copied by reference into a temporary struct first
before implementing the temporary trait.

This also fixes a bug where the serde_codegen wasn't making calls to
Serializer::serialize_{tuple,struct}_variant{,_elt}.
This allows a field to be serialized with an expression instead
of the default serializer. This simplifies serializing a struct
or enum that contains an external type that doesn't implement
`serde::Serialize`. This expression is passed a variable
`serializer` that needs to be used to serialize the expression.
This allows a field to be deserialized with an expression instead
of the default deserializer. This simplifies deserializing a struct
or enum that contains an external type that doesn't implement
`serde::Deserialize`. This expression is passed a variable
`deserializer` that needs to be used to deserialize the expression.
@erickt
Copy link
Member Author

erickt commented Feb 19, 2016

I've updated the PR to pickup the recent nightly breakage, and to remove #[serde(skip_serializing_if_none, skip_serializing_if_empty)] annotations.

@oli-obk
Copy link
Member

oli-obk commented Feb 19, 2016

It'd be pretty simple to change this to functions, as in:

I suggest to simply make that what's shown in the examples and docs, even if more complex expressions are allowed.

@erickt
Copy link
Member Author

erickt commented Feb 20, 2016

@oli-obk: I've been talking it over with a few people, and I'm going to experiment with switching over to using the idents / paths to functions in order to avoid hygiene issues. Since we're breaking new ground here, maybe it'd be better to be a bit conservative. We could always switch back to full expressions in a future release.

@arcnmx
Copy link

arcnmx commented Feb 21, 2016

This is getting really really awkward fast. Take the use case of skip_serializing_if for example, a common approach is to check if the value holds some condition (like value.is_some()). You want to make a fast path to this, like making skip_serializing_if="Option::is_some" valid and allowing libraries to provide similar helper methods. If the value is a path and the argument is the whole structure, code reuse just goes completely out the window because you have to know which parameter you're referring to.

So we have a few solutions to that...

  • One approach is to take a path but change the prototype to fn skip_serializing_if(context: &Struct, value: &i32) -> bool. Then helper methods would provide generic functions that ignore the inferred type of Struct except in exceptional cases.
    • example becomes something along the lines of skip_serializing_if="SerdeSkipHelper::option_is_some"
  • Just don't allow access to the self context at all. Makes it all a lot easier and more straightforward.
    • example is valid, skip_serializing_if="Option::is_some"
  • Use an expression that evaluates to a bool, as this PR originally did. Let them specify self.b == whatever and be done with it. Also pretty straightforward, slightly ugly at having code in an attribute string, but simple enough that maybe it's okay.
    • example becomes skip_serializing_if="self.b.is_some()", having to refer to the value through self and the name of the field is a bit ugly though. Could introduce a magical value variable into scope but that's a bit messy.
  • Use my PR's hybrid approach of accepting an expression that evaluates into something that then gets called as a function. This allows you to specify a function path, but optionally use an expression where you would access self if necessary.
    • example is valid, skip_serializing_if="Option::is_some"
    • It also opens the possibility to using inline code via closures, but that seems a bit silly and should probably be discouraged: skip_serializing_if="|v| v.is_some() && self.not_b == 3"
    • may also allow you to optionally use self context in a sort of simple way, like skip_serializing_if="some_helper(&self.not_b)"
    • maybe instead of evaluating as a function we could use a trait that we blanket impl for FnOnce? This may help because impl'ing Fn* traits manually is not yet stable, and make some_helper example above work nicer.

Same arguments and approaches go for serialize_with. I think it's important to provide a short path to specify a helper that serializes the value and can ignore the overarching struct or the name of the particular field being serialized.

Now, deserialize_with is different because there is no self context to work with, but we would want to make it consistent with the serialize_with approach so there's no need to discuss it directly.

default is a bit special because the expression approach is quite useful. I feel that it's unique enough that we don't necessarily need to worry about it using the exact same approach as the other attributes and so can pick whichever works best. Our options there are...

  • Original expression approach: being able to use constants and values as in default="123" is nicer than having to pull it out into a one-off external function.
  • Path approach: default="Default::default" and so on. Somewhat annoying when you want to use a constant or literal value or whatever but not a dealbreaker.
  • Hybrid approach as in my old PR. Accepts paths but is actually an expression and so optionally allows you to do fancy things like default="|| 123" or default="Constant(123)".

@arcnmx
Copy link

arcnmx commented Feb 21, 2016

Note that the hybrid approach's goal is to essentially make everyone use paths, but in the exceptional case where you need access to the other fields of self it provides the ability to access them. It's also a backwards-compatible change that could be made on top of the path approach if we simply don't provide access to self initially.

@erickt
Copy link
Member Author

erickt commented Feb 21, 2016

Updated the patch to give better error messages if there's a syntax error in the expression. Still working on the other approaches.

@arcnmx: You enumerated it pretty well. I think your right that access to self probably does not make sense for these attributes, since depending on other properties in the structure is starting to err towards magical-ness. At that point, maybe it'd be better to implement a custom Serialize/Deserialize. Maybe for expressions, self should be the field, not the container. Or for paths, the argument to skip_serializing_if="Option::is_some" is the field, which would be more along the lines of your original PR.

@arcnmx
Copy link

arcnmx commented Feb 21, 2016

I think your right that access to self probably does not make sense for these attributes, since depending on other properties in the structure is starting to err towards magical-ness. At that point, maybe it'd be better to implement a custom Serialize/Deserialize.

Mm, though it could be useful in some cases (and it's something that's possible for us to expose). I think the reason why I entertain it is because it's awkward to implement Serialize manually, having to maintain the state machine and whatnot. Most of my own manual Serialize impls look more like this, and a way to avoid doing this is what motivated me to create the initial PR in the first place:

impl Serialize for X {
    fn serialize<S: Serializer>(&self, s: &mut S) -> Result<(), S::Error> {
        // Custom modifications go in here
        #[derive(Serialize)]
        struct Inner<'a> {
            field1: Newtype<'a>,
            field2: Option<&'a Y>,
        }

        Inner { field1: Newtype(&self.field1), field2: Some(&self.field2) }.serialize(s)
    }
}

... but maybe the pattern is rare enough that we shouldn't need worry about it; I'm kind of on the fence about it as I think access to self is potentially useful, but also a bit magical...

Either way, I'm leaning toward the path approach, with maybe default as the exception that allows expressions, maybe not. I don't mind if the argument to the path method is the struct itself as long as we also provide one argument to the field as well. We could also later on use SerdeSkipHelpers::* into scope so that skip_serializing_if="is_none" and skip_serializing_if="is_empty" are provided as common shortcuts.

Also as a random aside, would it make sense to instead name the attribute skip_serializing, where the default (no value provided) is just conceptually the equivalent of true. The _if seems slightly verbose, I guess it reads better though... Just thought I'd put it out there.

@erickt
Copy link
Member Author

erickt commented Feb 21, 2016

@arcnmx: I got to step out for a bit, but I just pushed up a PR that switches from expressions to paths. I'm currently just passing in the field. When do you think the default would want access to the whole struct?

As to Serialize, I'm really hoping that it won't be too difficult to implement small wrappers like what you've written. I don't see what you've written as being too difficult.

@arcnmx
Copy link

arcnmx commented Feb 21, 2016

I got to step out for a bit, but I just pushed up a PR that switches from expressions to paths. I'm currently just passing in the field.

Cool, I feel that's probably the best approach to take for now!

When do you think the default would want access to the whole struct?

default can't access the whole struct, the only two that can are skip_serializing_if and serialize_with. The only advantage of expressions in default is allowing use of literals. It feels slightly more convenient and flexible, but I don't have a strong opinion either way. It's somewhat inconsistent with skip_serializing_if so maybe paths are still better for that reason alone.

As to Serialize, I'm really hoping that it won't be too difficult to implement small wrappers like what you've written. I don't see what you've written as being too difficult.

Mm, it's not difficult. Doing it for every structure is a huge pain though, and why I made the PR to avoid newtypes in the first place! You'll have to fall back to that if you need access to the whole struct which (ideally) isn't nearly as common a situation as needing a newtype so it should be fine.

@erickt
Copy link
Member Author

erickt commented Feb 21, 2016

@arcnmx: I just realized we can't pass in the whole container into these path functions, because we are re-wrapping enum structs in a hidden struct, so there would be no way for a user to reference those types. So we would either need to have expressions, or only pass in field references to paths. My vote is for the latter in order to be hygienic.

Also I did think about skip_serializing=... but I didn't think that would express its Boolean-ness nature as well as skip_serializing_if.

@arcnmx
Copy link

arcnmx commented Feb 21, 2016

I just realized we can't pass in the whole container into these path functions, because we are re-wrapping enum structs in a hidden struct, so there would be no way for a user to reference those types. So we would either need to have expressions, or only pass in field references to paths. My vote is for the latter in order to be hygienic.

Hmm... Would it not be possible to thread an instance of &OriginalType through somehow? Seems like it should at least be possible, but eh, it seems slightly uglier than just using the field as the sole parameter anyway. Having to be generic over an unknown struct for the very few situations where it's needed seems not so worth it.

@erickt
Copy link
Member Author

erickt commented Feb 21, 2016

@arcnmx: We could try to pass in the original type, but then we'd also have to pass in some string to identify the specific variant. That seems a bit tricky to me. We could revisit this if an rfc like rust-lang/rfcs#1450 lands that exposes a concrete type for a specific variant.

@arcnmx
Copy link

arcnmx commented Feb 21, 2016

We could try to pass in the original type, but then we'd also have to pass in some string to identify the specific variant

Well, you would pass a reference to the whole type, and to the particular field. Then just let the implementation choose what it needs / wants to do. For the most part it would ignore the enum field entirely. It'd be more type-safe if we had enum variant types, sure, but I don't see it as a major problem without them.

@erickt
Copy link
Member Author

erickt commented Feb 21, 2016

@arcnmx: Yeah, I think we could revisit this if it turns out to be an issue.

@arcnmx
Copy link

arcnmx commented Feb 21, 2016

Either way I think just passing the field reference alone is fine for now. I'm fine with either (field) or (struct, field) but the former is a more conservative / less weird design, and serves the majority of use cases.

@erickt
Copy link
Member Author

erickt commented Feb 21, 2016

@arcnmx: I'm going to err with conservatism here. I'm going to merge this in. Thanks for the reviews!

@target-san
Copy link

Seems I'm late for the party.
Though I'll leave one consideration here.
You basically added one universal way to do most stuff. Which is nice. Though you removed any ways to shorten typing. Like, for each Option field, library user would need to type #[serde(skip_serializing_if="Option::is_none")]. And, unlike wrapper newtype approach, there's no way to reuse multiple similar annotations. Imagine typing in medium-sized protocol, with many optional fields.

@arcnmx
Copy link

arcnmx commented Feb 22, 2016

I was kind of hoping we could shorten it to skip_serializing="is_none", and supply a generic is_empty for the various collection and slice types. Maybe even a special ="type::some_method" syntax to avoid having to write out complex types. Or a =".some_method" syntax to imply automagic deref / method dispatch. Slices probably require skip_serializing_if="<[_]>::is_empty" right now which is a little bit awkward...

And, unlike wrapper newtype approach, there's no way to reuse multiple similar annotations.

Not sure what you mean here... You should be able to create helper functions and reuse those. Do you mean #[serde(skip_serializing_if="some_helper")] is just too much typing in general for every field?


Considering that this has been merged now we should probably discuss on a new issue about any ergonomic concerns with it.

rubdos pushed a commit to rubdos/serde that referenced this pull request Jun 20, 2017
traits: Remove pattern in trait's method signature

This use of `mut` was nonsensical for now, but is proposed to be
disallowed by rustc in the future (as a bugfix).

See rust-lang/rust/pull/37378 for context.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants