I really don't understand serde custom deserialization #3
Comments
Happy to take a look. |
Thanks. I’ve got the hang of serialisation types but I just don’t see how deserialisation keeps state and knows which methods to call. If I could see what the derived macro code was doing I would probably be able to customise it, but I don’t know how to get at that either. We need counted arrays, offsets and some other tricks (see #2). Have a look at the Counted module in otspec/types.rs and the Font struct in src/font.rs for how I’ve done the serialisation. |
You can get the derived macro code with I'm looking now, and I'm also a bit confused; hard to keep track of what is supposed to be in the A potentially useful reference I'm looking through now is the To check my understanding: what I would expect here is that we are writing out a |
It’s the same question: what should be done by the Serialize trait and what should be done by the Serializer? Because our data layout is specific to the data we’re serialising there’s overlap. What I have done so far is to tell the Serialize trait that a Counted array maps to a serde seq, and tell the Serializer that a seq is written out with the length first. This seems to be how other implementations do it, and serialize_seq takes a length probably so that it can be written out first. I notice that, like the deserializer, calling a method on the serializer causes a move, so you can’t serialize twice in the same method. (ie if you try calling serialize_u16 to emit the count you then can’t serialize the vec.) serialize_seq returns a handle that can serialize multiple elements, so that was another hint that I should go in that direction. |
Yea, I think you're on the right track. I have some progress here: excuse the code-dump but I'm going to step out shortly. This is coming fairly close, but the test is failing the equality check; won't have time to dig into that today, unfortunately. :( pub mod Counted {
use serde::de::{SeqAccess, Visitor};
use serde::ser::SerializeSeq;
use serde::Serialize;
use serde::{Deserialize, Deserializer, Serializer};
pub fn serialize<S, T>(v: &[T], serializer: S) -> Result<S::Ok, S::Error>
where
S: Serializer,
T: Serialize,
{
let mut my_seq = serializer.serialize_seq(Some(v.len()))?;
my_seq.serialize_element(&(v.len() as u16));
for k in v {
my_seq.serialize_element(&k)?;
}
my_seq.end()
}
pub fn deserialize<'de, D, T: serde::Deserialize<'de>>(d: D) -> Result<Vec<T>, D::Error>
where
D: Deserializer<'de>,
{
d.deserialize_seq(SeqVisitor::new())
}
struct SeqVisitor<T> {
len: usize,
_phantom: std::marker::PhantomData<T>,
}
impl<T> SeqVisitor<T> {
fn new() -> Self {
SeqVisitor {
len: 0,
_phantom: std::marker::PhantomData,
}
}
}
impl<'de, T> Visitor<'de> for SeqVisitor<T>
where
T: serde::Deserialize<'de>,
{
type Value = Vec<T>;
fn expecting(&self, formatter: &mut std::fmt::Formatter) -> std::fmt::Result {
write!(formatter, "A sequence of {} values", self.len)
}
fn visit_seq<A: SeqAccess<'de>>(mut self, mut seq: A) -> Result<Self::Value, A::Error> {
self.len = seq
.next_element::<u16>()?
.ok_or_else(|| serde::de::Error::custom("Count type must begin with length"))? as usize;
let mut result = Vec::with_capacity(self.len);
for i in 0..self.len {
let next = seq
.next_element::<T>()?
.ok_or_else(|| serde::de::Error::invalid_length(i, &self))?;
result.push(next)
}
Ok(result)
}
}
} |
Thank you so much! I just added a small test case for the Counted module, and that works, so the problem is somewhere else. I think I can work on that from here. Thanks again. |
Stupid cut-and-paste error, I was deserialising an F2DOT14 with a I32Visitor. |
great, happy to hear. I definitely stared at this blankly for a long time before it clicked; it was actually in the process of writing a response about how I was confused and couldn't figure it out that I realized my mistake. Happy to see continued progress on this, it looks like it could be a really useful utility! |
Yep. I now have (mostly) Font objects serializing and deserialising. I'm hacking around the offset issue for now but I do want to get that sorted. |
What's the offset issue? |
In lots of cases, we need to be able to read data potentially out-of-order somewhere else in the big data structure, based on an offset value stored in the struct. e.g. in the top-level table: struct TableRecord {
tag: Tag,
checksum: uint32,
offset: uint32,
length: uint32,
} But I can't find a way to tell serde "go to this position in the stream and read this many bytes". |
Mainly because (a) I can't pass any data into a |
right, this is definitely a tricky thing to fit into the serde model. I'll let it stew for a bit, there may be some more or less hacky solutions. One might be to deserialize to an intermediate helper type, and then process that into your final representation afterwards, something like: struct cmap {
header: CmapHeader,
tables: Vec<CmapTable>,
}
#[derive(Deserialize)]
struct CmapHelper<'a> {
header: CmapHeader,
table_data: &'a [u8],
}
impl<'de> Deserialize<'de> for cmap {
fn deserialize<D>(deserializer: D) -> Result<cmap, D::Error>
where
D: Deserializer<'de>,
{
let helper = CmapHelper::deserialize(d)?;
let mut tables = Vec::with_capacity(helper.records.len());
for record in &helper.header.records {
tables.push(CmapTable::decode(record, helper.header.table_data)?);
}
Ok(cmap { header: helper.header, tables })
}
} |
I need to write a custom deserializer for Counted in types.rs and I just don't know how to do it. :-/
The text was updated successfully, but these errors were encountered: