MessagePack implementation for Rust / msgpack.org[Rust]

Overview

RMP - Rust MessagePack

RMP is a pure Rust MessagePack implementation.

This repository consists of three separate crates: the RMP core and two implementations to ease serializing and deserializing Rust structs.

crates.io API Documentation
rmp RMP
rmps RMP Serde
rmpv RMP Value

Features

  • Convenient API

    RMP is designed to be lightweight and straightforward. The low-level API gives you full control over the encoding/decoding process and makes no heap allocations. The high-level API provides a convenient interface built on the Rust standard library and compiler reflection, letting you encode and decode structures using the derive attribute.

  • Zero-copy value decoding

    RMP can decode bytes from a buffer in a zero-copy manner, easily and quickly, while Rust's static checks guarantee that the data remains valid as long as the buffer lives.

  • Clear error handling

    RMP's error system guarantees that you never receive an error enum with an unreachable variant.

  • Robust and tested

    This project is developed using TDD and CI, so any bugs found will be fixed without breaking existing functionality.
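
As a rough illustration of the zero-copy idea, the sketch below decodes a MessagePack fixstr by borrowing from the input buffer instead of copying it. It uses only the standard library; `read_fixstr` is a hypothetical helper written for this example and is not part of the rmp API.

```rust
/// Hypothetical helper (not an rmp function): decodes a MessagePack fixstr
/// (marker 0xa0..=0xbf) and returns the text plus the unread remainder.
/// Both results borrow from `buf`; nothing is copied.
fn read_fixstr<'a>(buf: &'a [u8]) -> Option<(&'a str, &'a [u8])> {
    let (&marker, rest) = buf.split_first()?;
    if marker & 0xe0 != 0xa0 {
        return None; // not a fixstr marker
    }
    let len = (marker & 0x1f) as usize;
    if rest.len() < len {
        return None; // truncated input
    }
    let (payload, tail) = rest.split_at(len);
    let text = std::str::from_utf8(payload).ok()?;
    Some((text, tail))
}

fn main() {
    let buf: Vec<u8> = vec![0xa3, b'F', b'o', b'o', 0x7b];
    let (text, tail) = read_fixstr(&buf).expect("valid fixstr");
    assert_eq!(text, "Foo");
    assert_eq!(tail, &[0x7b_u8][..]);
    // `text` points into `buf`, so the borrow checker rejects any attempt
    // to drop or mutate `buf` while `text` is still in use.
    println!("{}", text);
}
```

The lifetime `'a` on the return type is what ties the decoded string to the buffer.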

Requirements

  • Rust 1.34.2

Versioning

This project adheres to Semantic Versioning. However, until 1.0.0 is released, the following rules apply:

  • Any API/ABI-breaking change is noted explicitly in the changelog and results in a minor version bump.
  • API-extending features result in a patch version bump.
  • Non-breaking bug fixes and performance improvements result in a patch version bump.
Comments
  • Error when deserializing internally-tagged enum

    Given an internally-tagged enum with NewType variants, serializing produces a valid value but deserializing fails. For example:

    #[derive(Serialize, Deserialize)]
    #[serde(tag = "type")]
    enum TestEnum {
        Foo(TestType),
        Bar(TestType),
    }
    
    #[derive(Serialize, Deserialize)]
    struct TestType {
      inner: u32,
    }
    
    let test_enum1 = TestEnum::Foo(TestType{inner: 123});
    let serialized = rmp_serde::to_vec(&test_enum1).unwrap();
    let test_enum2 = rmp_serde::from_slice(&serialized).unwrap();
    

    The serializer produces [146, 163, 70, 111, 111, 123], equivalent to the JSON value ["Foo", 123]. That's the value I was expecting.

    The deserializer fails with the error Syntax("invalid type: sequence, expected any value").

    Everything works if I remove the line #[serde(tag = "type")], but then the serializer produces [146, 0, 145, 145, 123], equivalent to the JSON value [0, [[123]]]. If possible I'd prefer to use the internally tagged representation.

    opened by alex-shapiro 25
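
To make the byte sequences quoted in this report easier to read, here is a small standard-library sketch that names the single-byte "fix" markers defined by the MessagePack format. `describe_marker` is a hypothetical helper for illustration only; it does not come from rmp.

```rust
/// Hypothetical helper: describes a single-byte MessagePack marker from the
/// "fix" families; every other marker is reported generically.
fn describe_marker(byte: u8) -> String {
    match byte {
        0x00..=0x7f => format!("positive fixint {}", byte),
        0x80..=0x8f => format!("fixmap of {} pairs", byte & 0x0f),
        0x90..=0x9f => format!("fixarray of {} elements", byte & 0x0f),
        0xa0..=0xbf => format!("fixstr of {} bytes", byte & 0x1f),
        0xe0..=0xff => format!("negative fixint {}", byte as i8),
        _ => format!("other marker 0x{:02x}", byte),
    }
}

fn main() {
    // Bytes reported for the internally tagged representation.
    let tagged = [146u8, 163, 70, 111, 111, 123];
    println!("{}", describe_marker(tagged[0]));
    println!("{}", describe_marker(tagged[1]));
    println!("{}", String::from_utf8_lossy(&tagged[2..5]));
    println!("{}", describe_marker(tagged[5]));
}
```

Read this way, the first sequence is a two-element array holding the string "Foo" and the integer 123, which matches the ["Foo", 123] reading given in the report.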
  • Unchecked vector pre-allocation

    When playing around with afl, I found the following example:

    extern crate rmpv;
    
    use std::io::Cursor;
    
    fn main() {
        let data = [219, 175, 142, 142, 201, 219, 128, 0, 50, 175, 142, 196, 100, 212, 185];
        let mut cursor = Cursor::new(data);
        let decoded = rmpv::decode::value::read_value(&mut cursor);
        println!("Done: {:?}", decoded);
    }
    

    It takes almost 200ms in release mode and 2min 47s in debug mode to run.

    opened by dbrgn 22
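
The input in this report starts with byte 219 (0xdb, the str32 marker) followed by a four-byte length of roughly 2.9 GB. One common mitigation, sketched here with the standard library only, is to cap the speculative allocation and let the vector grow as bytes actually arrive. `read_payload` and `MAX_PREALLOC` are hypothetical names for this example, and this is not necessarily how rmpv addresses the problem.

```rust
use std::io::{self, Read};

/// Upper bound on speculative allocation; an arbitrary value for this sketch.
const MAX_PREALLOC: usize = 4096;

/// Hypothetical helper: reads exactly `declared_len` bytes without trusting
/// the declared length when sizing the initial allocation.
fn read_payload<R: Read>(rd: &mut R, declared_len: u32) -> io::Result<Vec<u8>> {
    let declared = declared_len as usize;
    let mut buf = Vec::with_capacity(declared.min(MAX_PREALLOC));
    // `take` stops at the declared length; `read_to_end` grows the vector
    // only as data is actually read.
    let read = rd.by_ref().take(u64::from(declared_len)).read_to_end(&mut buf)?;
    if read != declared {
        return Err(io::Error::new(
            io::ErrorKind::UnexpectedEof,
            "payload shorter than declared length",
        ));
    }
    Ok(buf)
}

fn main() {
    // Length taken from the four bytes after the str32 marker in the report.
    let declared = u32::from_be_bytes([175, 142, 142, 201]);
    let mut input = io::Cursor::new(vec![219u8, 128, 0, 50]);
    // Fails quickly with UnexpectedEof instead of reserving gigabytes.
    assert!(read_payload(&mut input, declared).is_err());
}
```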
  • std::net::IpAddr fails to roundtrip using 0.14.1

    We recently updated from 0.14.0 to 0.14.1 (and to serde version 1.0.104) and the following minimal test case

    use std::net::{IpAddr, Ipv4Addr};
    
    use rmp_serde::{from_slice, to_vec};
    
    #[test]
    fn roundtrip_ip_addr() {
        let addr = IpAddr::V4(Ipv4Addr::new(127, 0, 0, 1));
        let addr1: IpAddr = from_slice(&to_vec(&addr).unwrap()).unwrap();
    
        assert_eq!(addr1, addr);
    }
    

    started to fail with

    Syntax("invalid type: integer `0`, expected `V4` or `V6`")'
    
    opened by ghost 18
  • Make ReadRefReader.buf public

    Hi, I want to deserialize a stream of MessagePack structures with zero-copy.

    But when I create ReadRefReader with:

        let contents = std::fs::read(&path).unwrap();
        let mut deserializer = Deserializer::from_read_ref(&contents);
    

    ... and then deserialize a first struct:

        let msg = Message::deserialize(&mut deserializer)?;
    

    I have no way of knowing where the deserializer stopped.

    However, if I added this to the impl<'de, R> Deserializer<ReadRefReader<'de, R>>:

        pub fn get_buf(&self) -> &'de [u8] {
            self.rd.buf
        }
    

    ... and used it like this:

        let mut buf = &contents[..];
    
        let mut result = Vec::new();
    
        loop {
            let mut deserializer = Deserializer::from_read_ref(&buf[..]);
    
            if let Ok(event) = Message::deserialize(&mut deserializer) {
                result.push(event);
            }
            else {
                break;
            }
    
            buf = deserializer.get_buf();
        }
    
    

    It would work.

    So the question is: am I missing something? Is there a way to deserialize a stream of MessagePack structs from a buffer reference? If not, would it be possible to add the get_buf() method? And if not, can you think of a better way to implement this?

    opened by wbenny 12
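
One general way to learn where a decoder stopped is to read through a `std::io::Cursor` and check its position after each complete value. The sketch below shows the idea with a toy decoder that understands only positive fixint and uint8 values. `read_small_uint` and `decode_all` are hypothetical names, and whether the same approach fits rmp-serde's reader-based deserializers is an assumption that is not verified here.

```rust
use std::io::{self, Cursor, Read};

/// Toy decoder for this sketch: reads one MessagePack unsigned integer that
/// is either a positive fixint or a uint8 (marker 0xcc).
fn read_small_uint<R: Read>(rd: &mut R) -> io::Result<u8> {
    let mut byte = [0u8; 1];
    rd.read_exact(&mut byte)?;
    match byte[0] {
        value @ 0x00..=0x7f => Ok(value),
        0xcc => {
            rd.read_exact(&mut byte)?;
            Ok(byte[0])
        }
        _ => Err(io::Error::new(io::ErrorKind::InvalidData, "unsupported marker")),
    }
}

/// Decodes values until the input ends or a value fails to decode. Returns
/// the values and the offset just past the last complete one.
fn decode_all(contents: &[u8]) -> (Vec<u8>, usize) {
    let mut cursor = Cursor::new(contents);
    let mut values = Vec::new();
    let mut consumed = 0;
    while let Ok(value) = read_small_uint(&mut cursor) {
        values.push(value);
        // The cursor position marks the boundary after this value.
        consumed = cursor.position() as usize;
    }
    (values, consumed)
}

fn main() {
    // Three complete values followed by a truncated uint8.
    let contents = [0x01, 0xcc, 0xff, 0x7f, 0xcc];
    let (values, consumed) = decode_all(&contents);
    assert_eq!(values, [1u8, 255, 127]);
    assert_eq!(consumed, 4);
    println!("decoded {:?}, stopped at byte {}", values, consumed);
}
```

The slice `&contents[consumed..]` is then the unread tail that can be retried once more data is available.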
  • Serialize enum unit variants as just their name or index

    From this comment:

    For c-style enums this serializes quite poorly.

    enum Foo {
       A,
       B,
       C
    }
    

    Resulting msgpack is {"A": None} where I expected "A". Can this be fixed before release or does this need another option, were I to open a PR?

    It generally seems like rmp's output cannot really be consumed from other languages.

    serde_json does this optimization, and it seems pretty easy: https://github.com/serde-rs/json/blob/master/src/ser.rs#L883-L890. If I recall correctly, we'll need to modify the deserializer to accept this kind of enum serialization too - but that shouldn't be too bad.

    For consistency, I think we should do this both when writing string variant names and when writing integer variant indices.

    I'm not sure how breaking this change would be with our policies. On an upgrade, it would cause the serializer to produce different output for the same enum, output which would not be accepted by older versions of this library. We could also just add this as an option, and make it the default in the next major version?

    opened by daboross 12
  • API break between rmp-serde 0.14.0 and 0.14.2?

    I've recently encountered compilation breakage when upgrading from 0.14.0 to 0.14.2:

    error[E0107]: wrong number of type arguments: expected 2, found 1
      --> ykpack/src/encode.rs:11:10
       |
    11 |     ser: Serializer<&'a mut dyn Write>,
       |          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected 2 type arguments
    

    This is because Serializer had a new type parameter C added.

    Shouldn't 0.14.x have been bumped to 0.15.0 at the time this was introduced?

    opened by vext01 11
  • Utf8Error when parsing msgpack v4.0 encoded raw fields

    The previous version of the msgpack spec (v4.0) does not have the concept of UTF-8 strings - only a 'raw' type which admits any binary data. In the v5.0 revision the markers used by msgpack to indicate 'raw' data in v4.0 were repurposed (yes really) to mean UTF-8 strings, and a new set of markers for binary data were introduced; consequently, if you attempt to parse msgpack data from a 4.0 compliant implementation in strict accordance with the 5.0 spec, it is possible to receive Utf8Error:

    thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Utf8Error(Utf8Error { valid_up_to: 5 })'
    

    Could we get a field attribute that allows string fields to be interpreted as binary so that old msgpack files can be parsed successfully?

    opened by awh 11
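
The fallback requested here can be pictured with a standard-library sketch: try to interpret the payload as UTF-8 and keep the raw bytes when that fails. `RawField` and `interpret_raw` are hypothetical names for illustration; this is not an existing rmp-serde attribute or type.

```rust
/// Hypothetical type: a field that may hold text (v5 `str`) or arbitrary
/// bytes (v4 `raw`). Both variants borrow from the input.
#[derive(Debug, PartialEq)]
enum RawField<'a> {
    Text(&'a str),
    Bytes(&'a [u8]),
}

/// Interprets a payload as UTF-8 when possible and as raw bytes otherwise,
/// instead of surfacing a `Utf8Error`.
fn interpret_raw(payload: &[u8]) -> RawField<'_> {
    match std::str::from_utf8(payload) {
        Ok(text) => RawField::Text(text),
        Err(_) => RawField::Bytes(payload),
    }
}

fn main() {
    // Five valid UTF-8 bytes followed by one invalid byte.
    let legacy = [0x68, 0x65, 0x6c, 0x6c, 0x6f, 0xff];
    assert_eq!(std::str::from_utf8(&legacy).unwrap_err().valid_up_to(), 5);
    assert_eq!(interpret_raw(&legacy), RawField::Bytes(&legacy));
    assert_eq!(interpret_raw(b"hello"), RawField::Text("hello"));
}
```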
  • feat: add ext encoder for encoding enum variant names as strings

    Edit: As of 2019-06-20, I've made some edits to this PR, and rewritten the main description. Original description is collapsed below.

    This adds VariantStringSerializer and VariantIntegerSerializer to modify the encoding of enum variants similar to how the current StructMapSerializer and StructTupleSerializer behave.

    The serializer code itself should be pretty straightforward. Each new serializer forwards all of its methods to the inner one, except those necessary to encode variants. For those, it will encode the variant, then forward to the corresponding non-variant method so that chaining StructMapSerializer with VariantStringSerializer works together correctly.

    For wrapping StructMapSerializer with VariantStringSerializer, I had to introduce a new trait. Since both serializers modify the Serializer::serialize_struct_variant method, they could not really forward to each other properly using just the Serializer trait. SerializeParts is a new trait for doing sub-part serialization, and in particular it has a method for serializing just the ident of an enum variant.

    For deserializing this, I removed the code in VariantAccess which limited variants to be only integers. Since serde naturally allows for using strings to identify enums, nothing more was necessary.

    Fixes #172.

    Supersedes and closes #154.

    Original PR description

    Edit: This is an alternative to #180. #180 is significantly nicer, but is not backwards compatible.

    While this works, there are some drawbacks that I'm not sure about:

    • I had to remove some code in the VariantAccess decoding structure. It was hardcoded to only accept u32 for enum variants. Instead, it now forwards the deserialization directly to the underlying implementation so that the enum variant can be deserialized either with a u32 or a string.
    • Using with_struct_map or with_struct_tuple after with_string_variants will clobber its behavior. VariantStringSerializer correctly respects underlying struct encoding, but StructMapSerializer and StructTupleSerializer are both still hardcoded to use integer indices in places.

    One last note: the test round_trip_untagged_enum_with_enum_associated_data is currently failing, but that should be fixed by https://github.com/serde-rs/serde/pull/1438.

    opened by daboross 10
  • Signal safety

    Some low- and high-level functions are signal unsafe. That means that if the given Read or Write returns an Err with EINTR, there is no way to continue decoding/encoding properly.

    • [x] Investigate which functions and traits are signal unsafe and mark them in the documentation.
    • [x] Implement signal-safe alternatives if possible.
    T: bug C: lowlevel C: highlevel 
    opened by 3Hren 9
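
In Rust's standard library, EINTR surfaces as `ErrorKind::Interrupted`. A signal-safe read loop retries in that case rather than aborting mid-value, as in the sketch below. `read_exact_retrying` and the `Flaky` reader are hypothetical names for this example and do not show rmp's actual implementation. The standard `Read::read_exact` already ignores `Interrupted` in the same way; the loop is spelled out only to make the technique visible.

```rust
use std::io::{self, ErrorKind, Read};

/// Fills `buf` completely, retrying any read that fails with `Interrupted`.
fn read_exact_retrying<R: Read>(rd: &mut R, buf: &mut [u8]) -> io::Result<()> {
    let mut filled = 0;
    while filled < buf.len() {
        match rd.read(&mut buf[filled..]) {
            Ok(0) => {
                return Err(io::Error::new(ErrorKind::UnexpectedEof, "early end of input"));
            }
            Ok(n) => filled += n,
            // An interrupted call consumed nothing, so it is safe to retry.
            Err(ref e) if e.kind() == ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(())
}

/// Hypothetical reader for the demo: fails once with `Interrupted`, then
/// yields one byte per call.
struct Flaky {
    data: Vec<u8>,
    pos: usize,
    interrupted: bool,
}

impl Read for Flaky {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        if !self.interrupted {
            self.interrupted = true;
            return Err(io::Error::new(ErrorKind::Interrupted, "simulated EINTR"));
        }
        if self.pos == self.data.len() || buf.is_empty() {
            return Ok(0);
        }
        buf[0] = self.data[self.pos];
        self.pos += 1;
        Ok(1)
    }
}

fn main() {
    let mut rd = Flaky { data: vec![0xcc, 0x2a], pos: 0, interrupted: false };
    let mut buf = [0u8; 2];
    read_exact_retrying(&mut rd, &mut buf).expect("retry should succeed");
    assert_eq!(buf, [0xcc, 0x2a]);
}
```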
  • Transparent newtype struct representation

    https://github.com/3Hren/msgpack-rust/issues/129

    Changed encoding of struct A(B) from [B] to B. This PR is for newtype struct only. Variants and units are harder, so I'd like to discuss it first.

    opened by SX91 8
  • Value isn't encodable/decodable

    I'm (slowly) working on #40, and I'm running into a problem with representing heterogeneous lists (for arguments). I've resorted to using Vec<Value> as a struct field. However, since Value isn't encodable, it's difficult to serialize it as part of a larger struct.

    T: enhancement C: decode P: normal V: minor 
    opened by euclio 8
  • Incorrect docs: DefaultConfig

    https://github.com/3Hren/msgpack-rust/blob/941441f484c325464c9746fdca9d11c1ca6a2d78/rmp-serde/src/config.rs#L58

    It seems it doesn't actually write enum variants by index, it uses idents as strings to prefix enum variant.

    Impl: https://github.com/3Hren/msgpack-rust/blob/941441f484c325464c9746fdca9d11c1ca6a2d78/rmp-serde/src/config.rs#L87-L97

    opened by e-ivkov 0
  • Serializing HashMap<Vec<u8>,rmpv::Value>

    Hello, I would like to ask if any of you could help me write a serializer for HashMap<Vec<u8>, rmpv::Value>. When I use .to_vec(), it stores keys as a list of ints, which is not good or backwards compatible.

    opened by bobi6666 1
  • serde::to_vec(rmpv::Value) works unexpected

    Hi! It looks like rmp_serde::to_vec does not work properly for an rmpv::Value with a map inside it. I expected the result to be an encoded MessagePack vector, but the result is a map. For example:

    #[test]
    fn serialize_value_as_map() {
        let value = rmpv::Value::Map(vec![
            (rmpv::Value::from("a"), rmpv::Value::from(1)),
            (rmpv::Value::from("b"), rmpv::Value::from(2)),
        ]);
        let encoded = rmp_serde::to_vec(&value).unwrap();
    
        assert_eq!(vec![130, 161, 97, 1, 161, 98, 2], encoded);
    }
    
    opened by godzie44 0
  • Serialize enumeration as index

    Hello,

    I noticed that between versions 0.15.4 and 1.0.0 the enumerations are serialized by name rather than index.

    Is it possible to restore the old behavior through configuration?

    opened by woodworker7 0
  • Can't handle #[serde(tag = "someKey")] for structure while Deserialization

    use rmp_serde::Serializer;
    use serde::Serialize;
    
    extern crate rmp_serde;
    extern crate serde;
    #[macro_use]
    extern crate serde_derive;
    
    #[derive(Debug, Serialize, Deserialize)]
    #[serde(rename_all = "snake_case")]
    struct  A {
        b: B,
    }
    
    #[derive(Debug, Serialize, Deserialize)]
    #[serde(tag = "kind", content = "data")]
    #[serde(rename_all = "snake_case")]
    pub enum B {
        First(i32),
        Second(Messagedata)
    }
    
    #[derive(Debug, Serialize, Deserialize)]
    #[serde(tag = "someKey")]
    pub struct Messagedata {
        status: String,
        message: String,
    }
    
    fn main() {
        let a = A {
            b: B::Second (Messagedata { status: "Failure".to_string(), message: "You are not getting the things".to_string() }),
        };
        //SERDE
        let string_data = serde_json::to_string(&a).unwrap();
        println!("data: {:?}", string_data);
        let data = serde_json::from_str::<A>(&string_data).unwrap();
        println!("A: {:?}", data);
        println!("-------------------------------------------------------------------------------");
    
        //MESSAGEPACK
        let mut buf = Vec::new();
        a.serialize(&mut Serializer::new(&mut buf)).unwrap();
        // println!("a: {:?}", buf);
        let a = rmp_serde::from_slice::<A>(&buf).unwrap();
        println!("a: {:?}", a);
    }
    
    

    Here is the error I am getting:

    data: "{\"b\":{\"kind\":\"second\",\"data\":{\"someKey\":\"Messagedata\",\"status\":\"Failure\",\"message\":\"You are not getting the things\"}}}"
    A: A { b: Second(Messagedata { status: "Failure", message: "You are not getting the things" }) }
    -------------------------------------------------------------------------------
    thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: LengthMismatch(2)', src/main.rs:45:46
    note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
    

    There is no problem serializing and deserializing with serde_json. With rmp_serde, serializing a struct with a given tag works, but deserializing fails with a length error because it cannot handle the tag provided as a key.

    opened by harshkumar314e 1
  • Can't deserialize entire file

    I can't deserialize an entire file because the Deserializer does not implement into_iter as other serde libraries do.

    How can I get around this?

    Code thus far is:

    use std::fs::File;
    use std::io::{BufReader, BufWriter, Write};

    fn main() -> Result<(), Box<dyn std::error::Error>> {
        let file_path = "./src/foo.msgpack";
        let reader = BufReader::new(File::open(file_path).unwrap());
        let writer = BufWriter::new(File::create("./src/results.json").unwrap());
    
        let mut deserializer = rmp_serde::Deserializer::from_read(reader);
    
        // let mut serializer = serde_json::Serializer::new(io::stdout());
        let mut serializer = serde_json::Serializer::pretty(writer);
    
        serde_transcode::transcode(&mut deserializer, &mut serializer).unwrap();
        serializer.into_inner().flush().unwrap();
    
        Ok(())
    }
    
    opened by StuartHadfield 8
Releases(rmp-serde/v0.14.4)
  • rmp-serde/v0.14.4(Jul 24, 2020)

  • rmp-serde/v0.8.1(Oct 3, 2015)

    Changed

    • Upper limit for serde version.

    Fixed

    • Use the most efficient int encoding. Even if the value is explicitly marked as i64, it must be encoded using the most compact byte representation, regardless of whether it is signed or unsigned.
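
The rule in this fix can be sketched with the standard library alone: pick the shortest MessagePack integer format that can hold the value, whatever the Rust type was. `encode_int` is a hypothetical function for illustration, not the rmp-serde implementation; the marker bytes follow the MessagePack format.

```rust
/// Hypothetical encoder: writes `value` in the shortest MessagePack integer
/// format, ignoring the width of the Rust type it came from.
fn encode_int(value: i64) -> Vec<u8> {
    let mut out = Vec::with_capacity(9);
    match value {
        // Positive fixint (0x00..=0x7f) and negative fixint (0xe0..=0xff).
        -32..=0x7f => out.push(value as u8),
        0x80..=0xff => out.extend_from_slice(&[0xcc, value as u8]),
        0x100..=0xffff => {
            out.push(0xcd);
            out.extend_from_slice(&(value as u16).to_be_bytes());
        }
        0x1_0000..=0xffff_ffff => {
            out.push(0xce);
            out.extend_from_slice(&(value as u32).to_be_bytes());
        }
        v if v > 0 => {
            out.push(0xcf);
            out.extend_from_slice(&(v as u64).to_be_bytes());
        }
        -0x80..=-33 => out.extend_from_slice(&[0xd0, value as u8]),
        -0x8000..=-0x81 => {
            out.push(0xd1);
            out.extend_from_slice(&(value as i16).to_be_bytes());
        }
        -0x8000_0000..=-0x8001 => {
            out.push(0xd2);
            out.extend_from_slice(&(value as i32).to_be_bytes());
        }
        _ => {
            out.push(0xd3);
            out.extend_from_slice(&value.to_be_bytes());
        }
    }
    out
}

fn main() {
    // An i64 holding a small value still takes a single byte.
    let small: i64 = 42;
    assert_eq!(encode_int(small), [0x2a_u8]);
    println!("{:?}", encode_int(-129));
}
```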
  • rmp/v0.7.1(Sep 11, 2015)

    Changed

    • Use to_owned instead of to_string while converting ValueRef into Value. This change improves ValueRef::to_owned() method performance by approximately 10-20%.

      Also, after this commit it's cheaper to decode into ValueRef and then convert to an owned value than to decode directly into Value.

  • rmp-serde/v0.8.0(Sep 11, 2015)

    Changed

    • Serializer can now be extended with custom struct encoding policy.
    • Improved error types and their messages for the serialization part.
      • New error type introduced: UnknownLength. It is returned on an attempt to serialize a struct, map, or sequence with unknown length (Serde allows this).
      • The new type is returned if necessary.

    Fixed

    • Deserializer now properly works with enums.
    • Deserialization of options with default values (that can be initialized using the unit marker). This fix also forbids the following Option deserialization cases:
      • Option<()>.
      • Option<Option<...>>.

      It's impossible to properly deserialize the listed cases without an explicit option marker in the protocol.
    • Serializer now properly serializes unit structs. Previously they were serialized as a unit (nil); now they are serialized as an empty array ([]).
  • v0.6.0(Aug 17, 2015)

    Added

    • Initial support for Serde serializer and deserializer.
    • Efficient bytes serialization with Serde.
    • Efficient binaries deserialization with Serde using ByteBuf.
    • Rust serialize Decoder can now provide the underlying reader either by reference or by value, destroying itself in the latter case.

    Changed

    • Update well-formedness for the BigEndianRead trait to be implemented only for sized types.
    • Renamed PositiveFixnum marker to FixPos.
    • Renamed NegativeFixnum marker to FixNeg.
    • Renamed FixedString marker to FixStr.
    • Renamed FixedArray marker to FixArray.
    • Renamed FixedMap to FixMap.
    • Minor documentation updates and markdown fixes.
  • v0.5.1(Aug 10, 2015)

    Changed

    • Now the rustc_serialize::Encoder should encode signed integers using the most effective underlying representation.
    • Now the rustc_serialize::Decoder should properly map integers to the result type if the decoded value fits in result type's range.
Owner
Evgeny Safronov
ex-@yandex, now @sonm-io