Output JSON schema during build process #176
Comments
This is tricky. It seems like Serde has all that information but in reality we don't. At compile time, we have type information about only one struct/enum at a time. For example we might know that your Conversely at runtime Serde deals with values only, not types. So there would be no way for Serde to come up with a JSON schema for a particular type unless we can get a concrete instance of that type from somewhere and try serializing it. And even then we would be blind to any shenanigans the type might try to do, like serializing itself as an integer if the day of the month is prime and serializing itself as a string otherwise. serde-rs/serde#345 is tracking a similar request. If this is a feature you need (rather than "wouldn't it be great if..."), I think a more promising place to start would be implementing this as a compiler plugin, similar to how Clippy works. I think they have a lot more type information at that stage than what we have in Serde. |
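To make the "shenanigans" point concrete, here is a minimal sketch of a type whose serialized shape depends on its runtime value. It deliberately avoids the real Serde API: `Value` and `ToValue` are hypothetical stand-ins for a serialized value and a hand-written serialization impl, and `DayOfMonth` is an invented example type.

```rust
/// Hypothetical stand-in for a serialized JSON value (not a Serde type).
#[derive(Debug, PartialEq)]
enum Value {
    Integer(u32),
    Text(String),
}

/// Hypothetical stand-in for a hand-written serialization impl.
trait ToValue {
    fn to_value(&self) -> Value;
}

/// Invented example type: a day of the month.
struct DayOfMonth(u32);

/// Trial division; good enough for values up to 31.
fn is_prime(n: u32) -> bool {
    n >= 2 && (2..n).take_while(|&d| d * d <= n).all(|d| n % d != 0)
}

impl ToValue for DayOfMonth {
    // The output *type* depends on the runtime value, so observing one
    // serialized instance says nothing reliable about the others.
    fn to_value(&self) -> Value {
        if is_prime(self.0) {
            Value::Integer(self.0)
        } else {
            Value::Text(self.0.to_string())
        }
    }
}
```

Serializing `DayOfMonth(7)` here yields an integer while `DayOfMonth(8)` yields a string, so a schema inferred from a single sample value would be wrong for the other case.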
I'm not really familiar with the implementation of Serde, but from your description it sounds like you have enough information, you just don't have it all at once. There is a set of unconnected (from the point of view of Serde) points of analysis:
This isn't something I need in the long run, it's more something I want for Rust in the long run. Having something like this would really help make it even more compelling for web devs and server architects (not exclusively, just especially). If this is something that's really, truly, impossible for Serde, my next recommendation (if anyone is watching) is to use external mixins to call one of the JS, Python, or Ruby libraries that could get this done. But that would at best be a hack, and not really the most appropriate thing for Rust long term. I agree Clippy would be a good choice for linting (validating) against the schema. |
Referring to what @oli-obk said on serde-rs/serde#345:
Could you elaborate? |
This is the same thing I was getting at above:
In other languages like Java or Go, data serialization is typically built on runtime reflection which lets you do things like this at runtime:
Rust does not have runtime reflection.
Instead Serde serializes structs by generating code at compile time to serialize them.
Hopefully that clarifies the limitations of putting together a complete JSON schema at Serde's position in the stack. I didn't know about In very simplified form, compilation works like this:
That is why I suggested Clippy as a more promising starting point for this. Not that JSON schema should be built into Clippy, but that something at that stage would have all of the relevant information available and would be able to do the job better. It may make sense for the Serde team to own this functionality but basically none of what we already have is going to be helpful, so we would be starting from scratch whether it goes in Serde or into a separate library. |
Excellent explanation. Really clears things up! Love your personification of Rust. To summarize:
This leads me to some questions:
Thank you for bearing with me by the way. |
It would be possible to add a static method on the Even then, you can't do compile time checks, but you can simply write a unit test validating the format against a schema |
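A rough sketch of the static-method idea, under stated assumptions: the `Schema` trait, its `schema()` method and the `User` struct are all hypothetical names chosen for illustration, and none of this is part of Serde. The impl for `User` is written by hand, but it is the kind of code a derive could plausibly generate, since it only names the field types and lets the compiler resolve their impls later.

```rust
/// Hypothetical trait; Serde does not provide this.
trait Schema {
    /// Returns a JSON Schema fragment for the implementing type.
    fn schema() -> String;
}

impl Schema for u32 {
    fn schema() -> String {
        r#"{"type":"integer"}"#.to_string()
    }
}

impl Schema for String {
    fn schema() -> String {
        r#"{"type":"string"}"#.to_string()
    }
}

impl<T: Schema> Schema for Vec<T> {
    fn schema() -> String {
        format!(r#"{{"type":"array","items":{}}}"#, T::schema())
    }
}

/// Invented example struct.
struct User {
    id: u32,
    tags: Vec<String>,
}

// Looks at one struct at a time: each field's schema comes from that
// field type's own impl, without this code knowing what the type is.
impl Schema for User {
    fn schema() -> String {
        format!(
            r#"{{"type":"object","properties":{{"id":{},"tags":{}}}}}"#,
            <u32 as Schema>::schema(),
            <Vec<String> as Schema>::schema()
        )
    }
}
```

Because `schema()` runs at runtime, its output can be compared against an expected schema in an ordinary unit test, but it cannot enforce anything at compile time, and it shares the limitation that a custom serialization impl may not match what the schema claims.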
Serde has a "procedural macro" for nightly (will be stabilized in Rust 1.15 in February). This is the same phase as what used to be the serde_macros "compiler plugin," just the mechanics have changed in order to stabilize a part of it. My talk at the most recent SF meetup discusses how these work (start at 3:00).
@oli-obk can speak more to how compiler plugins work since he has contributed extensively to Clippy. My current understanding is that it would be better to just do it all in a compiler plugin. It should have all the same syntactic information plus also type information.
As @oli-obk responded above, this is equivalent to what Serde is doing and has all the same limitations.
No, just syntax-level information about the current crate. Think of it as nothing more than the textual source code of the current crate. |
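The "textual source code" point can be illustrated with a small sketch. It uses a declarative `macro_rules!` macro rather than a procedural macro, because that fits in a single file; both kinds of macro work on tokens, not resolved types. The macro `field_types!`, the alias `Day` and the struct `Date` are invented for this example.

```rust
// Declares the struct and records each field's type exactly as written.
macro_rules! field_types {
    (struct $name:ident { $($field:ident : $ty:ty),* $(,)? }) => {
        struct $name { $($field: $ty),* }

        impl $name {
            /// Field names paired with the *spelling* of their types.
            fn field_types() -> Vec<(&'static str, &'static str)> {
                vec![$((stringify!($field), stringify!($ty))),*]
            }
        }
    };
}

// An alias the macro has no way to look through.
type Day = u32;

field_types!(struct Date { day: Day, month: u8 });
```

`Date::field_types()` reports the first field's type as the text `Day`. The macro cannot tell whether `Day` is an integer alias, a struct, or an enum defined in another crate, and that missing piece is exactly what a schema generator would need.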
Using a compiler plugin to spit out a schema for a type is almost trivial once you get a handle on the type. Clippy actually has a very rudimentary inspection lint that almost does that. I'd be happy to mentor any extensions to it. |
@dtolnay Thank you for referring me to that video! I was looking for something more recent than the RustConf 2016 talk I had seen. Was this video on TWiR? I hadn't seen it there... If it's OK, I have some questions about your talk:
|
@oli-obk Could you point me to the specific lint you're talking about? |
@dtolnay Thanks for pointing out Valico! I had no idea it offered schema building and validation. That really helps solve a chunk of this problem. I don't know if a direct port is the best thing though. If you're adding it to Serde, it's probably best to generalize it so that it can support schemas for other languages. Is Serde's existing par/gen infrastructure something that would be helpful here? |
I moved it to https://github.com/dtolnay/talks/issues/1 so as not to derail the discussion here.
Good call, but whether we generate JSON schema directly or a higher-level broadly applicable Serde schema, eventually we will need a way to get a JSON schema so I would rather reuse an existing high-quality implementation of that. |
The inspector lint is implemented here: https://github.com/Manishearth/rust-clippy/blob/master/clippy_lints/src/utils/inspector.rs |
I would be interested in seeing this handled by a separate crate dedicated to JSON schema. |
@lylemofitt @dtolnay FYI, I'm working on a crate for generating MongoDB-flavored JSON schemas: https://github.com/H2CO3/magnet — it's not generating 100% standard-compliant JSON schema because MongoDB's spec is more precise and powerful (and I need it for document validation), but it's close, and I think my approach is pretty reasonable should someone want to extend/build upon it. |
For future readers that seek a solution, I found this repository: |
It would be great if Serde could optionally produce a JSON schema as a side-effect of the build process. AFAIK it has all the information it needs to write one. You just need to translate the structs/enums to their appropriate schema representations (read: matching JSON type).
Additional:
While the above is an awesome starting block, it would also be really nice if you could check at compile time that Serde's JSON will validate against an externally provided schema. This isn't strictly necessary, as you could do it after the fact with a tool like ajv. A compile-time check would just provide stronger guarantees.
Motivation
Anticipated Questions: