Rust library for reading/writing numbers in big-endian and little-endian.

Overview

byteorder

This crate provides convenience methods for encoding and decoding numbers in either big-endian or little-endian order.

Dual-licensed under MIT or the UNLICENSE.

Documentation

https://docs.rs/byteorder

Installation

This crate works with Cargo and is on crates.io. Add it to your Cargo.toml like so:

[dependencies]
byteorder = "1"

If you want to augment existing Read and Write traits, then import the extension methods like so:

use byteorder::{ReadBytesExt, WriteBytesExt, BigEndian, LittleEndian};

For example:

use std::io::Cursor;
use byteorder::{BigEndian, ReadBytesExt};

let mut rdr = Cursor::new(vec![2, 5, 3, 0]);
// Note that we use type parameters to indicate which kind of byte order
// we want!
assert_eq!(517, rdr.read_u16::<BigEndian>().unwrap());
assert_eq!(768, rdr.read_u16::<BigEndian>().unwrap());

no_std crates

This crate has a feature, std, that is enabled by default. To use this crate in a no_std context, add the following to your Cargo.toml:

[dependencies]
byteorder = { version = "1", default-features = false }

Alternatives

Note that as of Rust 1.32, the standard numeric types provide built-in methods like to_le_bytes and from_le_bytes, which support some of the same use cases.
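
As a rough illustration of that overlap, the read example from the Installation section can be approximated with the standard library alone. This is only a sketch: the function names `read_two_u16_be` and `encode_u32_le` are made up for illustration, while `from_be_bytes` and `to_le_bytes` are the std methods mentioned above.

```rust
/// Decode two big-endian u16 values from a 4-byte buffer using only std.
fn read_two_u16_be(buf: [u8; 4]) -> (u16, u16) {
    let first = u16::from_be_bytes([buf[0], buf[1]]);
    let second = u16::from_be_bytes([buf[2], buf[3]]);
    (first, second)
}

/// Encode a u32 as little-endian bytes using only std.
fn encode_u32_le(n: u32) -> [u8; 4] {
    n.to_le_bytes()
}
```

Calling `read_two_u16_be([2, 5, 3, 0])` yields `(517, 768)`, the same values as the Cursor-based example. Unlike the extension traits, these methods work on fixed-size arrays rather than on Read and Write streams.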

Comments
  • Add a method to read the first bytes of a float (and assume the rest are 0)

    I'm not sure if this is broadly useful enough to be worth adding to the library, but I've found myself using a similar function to this several times in a recent project so I thought I'd submit a PR in case you wanted it.

    The idea is to read the first n bytes of an f64 as a uint and then assume that the rest of the bytes are 0. This is very useful if you're parsing lots of compressed floats in little-endian format where the low bits (which are often zero) can be dropped (eg. due to a form of run length encoding, or other compression that drops sequences of zeros).

    opened by SamWhited 30
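
One possible reading of this idea, sketched with std only: the stored bytes are taken to be the most significant bytes of the little-endian f64, and the dropped low-order bytes are treated as zero. The function name `read_truncated_f64_le` and this exact interpretation are assumptions, not the PR's actual code.

```rust
/// Rebuild an f64 from the `bytes.len()` most significant bytes of its
/// little-endian encoding, treating the dropped low-order bytes as zero.
fn read_truncated_f64_le(bytes: &[u8]) -> f64 {
    assert!(bytes.len() <= 8, "an f64 has at most 8 bytes");
    let mut full = [0u8; 8];
    // In little-endian order the most significant bytes come last,
    // so the surviving bytes fill the tail of the buffer.
    full[8 - bytes.len()..].copy_from_slice(bytes);
    f64::from_le_bytes(full)
}
```

For example, 1.0 has the bit pattern 0x3FF0_0000_0000_0000, so the two bytes `[0xF0, 0x3F]` are enough to reconstruct it under this scheme.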
  • Improve WriteBytesExt API

    At the moment the byteorder crate uses an enum to generalize over the endianness and the method name to select the type. Neither is very ergonomic to use. I propose the following interface:

    trait WriteBytesExt<T> {
        fn write_le(&mut self, n: T) -> io::Result<()>;
        fn write_be(&mut self, n: T) -> io::Result<()>;
    }
    
    impl<W> WriteBytesExt<u8> for W where W: Write {
        fn write_le(&mut self, n: u8) -> io::Result<()> {
            ....
        }
        ....
    }
    

    First of all, it gets rid of the enum. Since the enum is purely a compile-time parameter, it cannot be used for dynamic dispatch. This is as good or bad as having it directly in the method name. Thus I do not see the point of having it. Second, it gets rid of the redundant type name in the signature.

    This shortens the method call significantly

    w.write_u16::<LittleEndian>(42u16)
    

    becomes

    w.write_le(42u16)
    

    My points are:

    1. The type in the method signature carries redundant information.
    2. The enum type parameter does not provide any benefit for the user.
    3. Enums are most useful for runtime polymorphism. Since the enum variants are not types, these *BytesExt traits cannot be used to write generic code that abstracts over endianness. Again, no benefit for the user.
    opened by nwin 30
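
To make the proposed shape concrete, here is a compilable sketch built on std's Write trait. It is not part of byteorder: the trait is named `WriteEndian` here (a hypothetical name, chosen to avoid clashing with the real WriteBytesExt), and only u16 and u32 are covered.

```rust
use std::io::{self, Write};

/// Hypothetical trait following the proposal: the value type is a trait
/// parameter and the byte order is part of the method name.
trait WriteEndian<T> {
    fn write_le(&mut self, n: T) -> io::Result<()>;
    fn write_be(&mut self, n: T) -> io::Result<()>;
}

impl<W: Write> WriteEndian<u16> for W {
    fn write_le(&mut self, n: u16) -> io::Result<()> {
        self.write_all(&n.to_le_bytes())
    }
    fn write_be(&mut self, n: u16) -> io::Result<()> {
        self.write_all(&n.to_be_bytes())
    }
}

impl<W: Write> WriteEndian<u32> for W {
    fn write_le(&mut self, n: u32) -> io::Result<()> {
        self.write_all(&n.to_le_bytes())
    }
    fn write_be(&mut self, n: u32) -> io::Result<()> {
        self.write_all(&n.to_be_bytes())
    }
}
```

With this in scope, `w.write_le(42u16)` picks the u16 implementation from the argument type. An unsuffixed literal such as `w.write_le(42)` would be ambiguous between the implementations, which is one trade-off of this design.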
  • Add a version using new `io` module

    As I understand, the purpose of this crate is to prepare for a world where we no longer have the endian-writing/reading functions on Reader and Writer. As someone who uses those functions a lot, I would like to prepare my crates (namely bincode) for the new io crate. I'd be willing to help do the port if you are interested.

    opened by TyOverby 25
  • approaching 1.0

    byteorder is very heavily used, but its API has mostly remained the same since it was first released (which was inspired by both Go's encoding/binary package and the pre-existing methods in Rust's old standard library that fulfilled a similar role). There was however significant discussion on its API in #27, but I feel that no consensus has been reached and I don't think there's an obviously better API given Rust in its current form. Therefore, I'd like to propose that we cut a 1.0 release in the next few weeks.

    I think the only outstanding issue that we should try to resolve before 1.0 is #52.

    cc @nwin @TyOverby @sfackler @lambda

    help wanted 
    opened by BurntSushi 16
  • Consider mechanisms to convert &[u32] to &[u8]

    It makes me a little sad to see unsafe being used to convert a &[u32] into a &[u8] in octavo:

        fn crypt(&mut self, input: &[u8], output: &mut [u8]) {
            assert_eq!(input.len(), output.len());
    
            if self.index == STATE_BYTES { self.update() }
    
            let buffer = unsafe {
                slice::from_raw_parts(self.buffer.as_ptr() as *const u8, STATE_BYTES)
            };
    
            for i in self.index..input.len() {
                output[i] = input[i] ^ buffer[i];
            }
    
            self.index = input.len();
        }
    

    We really ought to have a place to centralize this functionality so that it's well tested and safe across our ecosystem. Would it make sense to have this functionality be in byteorder?

    It'd also be interesting to support the inverse operation, where a &[u8] is converted into a (&[u8], &[u32], &[u8]), where the first and last slices are there to read a byte at a time until the slice is aligned. This style of operation could be useful to safely massage a slice into something that can use simd (or at least simd-ish operations over a usize value).

    cc @huonw, @bluss, @hauleth

    opened by erickt 15
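
Both directions can be sketched with the standard library. The function names below are hypothetical. The first function avoids unsafe by copying, and the second wraps std's `align_to`, which returns exactly the (head, aligned middle, tail) triple described above but still needs a single unsafe call.

```rust
/// Copy a slice of u32 words into little-endian bytes without `unsafe`.
fn u32s_to_le_bytes(words: &[u32]) -> Vec<u8> {
    let mut out = Vec::with_capacity(words.len() * 4);
    for w in words {
        out.extend_from_slice(&w.to_le_bytes());
    }
    out
}

/// Split bytes into an unaligned head, a run of aligned u32 words, and a tail.
/// The words are read in the host's native byte order.
fn split_aligned_u32(bytes: &[u8]) -> (&[u8], &[u32], &[u8]) {
    // SAFETY: every 4-byte bit pattern is a valid u32, and `align_to`
    // only places correctly aligned memory in the middle slice.
    unsafe { bytes.align_to::<u32>() }
}
```

The copying version costs an allocation but fixes the byte order explicitly. The `align_to` version is zero-copy, and the split point depends on where the buffer happens to sit in memory.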
  • Speed up slice writes

    Hi there,

    I've been toying around with adding the faster crate to a few encoding libraries, and I noticed that I could get up to a 6x speed boost by using it in write_u16_into, write_u32_into, and write_u64_into. The compiler does a pretty good job of vectorizing the read functions.

    Would there be any interest in adding this behind a feature?

    Benchmarks: (Ivy Bridge host; 128-bit integer vectors)

    faster (No difference between target-cpu=native and target-cpu=x86-64)
    test slice_u16::write_big_endian    ... bench:      23,344 ns/iter (+/- 122) = 8567 MB/s
    test slice_u32::write_big_endian    ... bench:      46,681 ns/iter (+/- 160) = 8568 MB/s
    test slice_u64::write_big_endian    ... bench:     105,206 ns/iter (+/- 369) = 7604 MB/s
    master (-C target-cpu=native)
    test slice_u16::write_big_endian    ... bench:     147,829 ns/iter (+/- 269) = 1352 MB/s
    test slice_u32::write_big_endian    ... bench:     112,241 ns/iter (+/- 652) = 3563 MB/s
    test slice_u64::write_big_endian    ... bench:     108,404 ns/iter (+/- 571) = 7379 MB/s
    
    opened by AdamNiederer 12
  • Change as_ptr to as_mut_ptr to fix Miri error

    Before, the example in the docs for ByteOrder::from_slice_i32 caused Miri to error with the message, "error: Undefined Behavior: trying to reborrow for Unique at alloc1389, but parent tag does not have an appropriate item in the borrow stack". Now it runs without errors (tested locally by creating an example and running it with cargo +nightly miri run --example the_example).

    (This is the example in the Rust Playground. You can run it with Miri by selecting "Miri" from the "Tools" menu.)

    Fwiw, I'm not sure if the original code really has undefined behavior or not, but this PR is a simple change, and the new code is a little clearer anyway.

    opened by jturner314 11
  • Writing to uninitialized buffer

    The write_* methods of the ByteOrder trait accept a buffer and don't guarantee that they won't read from it. The drawback is that, strictly speaking, the provided buffer shouldn't be uninitialized.

    I suggest providing some way of guaranteeing that the buffer won't be read from, so that it's fine to pass an uninitialized buffer.

    wontfix 
    opened by Kixunil 10
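
One way to express such a guarantee in the type system is to accept MaybeUninit bytes, which can be written but not read in safe code. This is a std-only sketch of the idea, not something byteorder provides, and `write_u32_le_uninit` is a hypothetical name.

```rust
use std::mem::MaybeUninit;

/// Write `n` as little-endian bytes into a possibly uninitialized buffer.
/// Each slot is overwritten; nothing is ever read from the buffer.
fn write_u32_le_uninit(buf: &mut [MaybeUninit<u8>; 4], n: u32) {
    for (slot, byte) in buf.iter_mut().zip(n.to_le_bytes().iter()) {
        *slot = MaybeUninit::new(*byte);
    }
}
```

The caller still needs unsafe (for example `assume_init`) to read the bytes back, but the function's signature documents that it only initializes the buffer.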
  • Read Write for core

    @Tobba and I have worked on https://github.com/QuiltOS/core_io, a copy of Read and Write but with an associated error type to make it just need core. Perhaps it would be nice to (optionally) extend these traits for no_std users?

    opened by Ericson2314 8
  • Consider adding runtime-defined endianness

    Sometimes it is impossible to determine the required endianness statically. For example, the TIFF image format declares its endianness in the first two bytes of the file, so the data may be either big-endian or little-endian, and which one is not known at compile time. It would be nice if I could use byteorder for this task too.

    question 
    opened by netvl 8
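
A common workaround is a small enum that is matched at run time. The sketch below uses std only. The `Endian` type and its methods are hypothetical names, and the header check relies on the TIFF convention that a file starts with "II" for little-endian or "MM" for big-endian.

```rust
/// Byte order chosen at run time rather than through a type parameter.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Endian {
    Big,
    Little,
}

impl Endian {
    /// Detect the byte order from a TIFF-style header:
    /// "II" means little-endian and "MM" means big-endian.
    fn from_tiff_header(header: &[u8]) -> Option<Endian> {
        match header.get(..2)? {
            b"II" => Some(Endian::Little),
            b"MM" => Some(Endian::Big),
            _ => None,
        }
    }

    /// Decode a u16 using the byte order stored in `self`.
    fn read_u16(self, bytes: [u8; 2]) -> u16 {
        match self {
            Endian::Big => u16::from_be_bytes(bytes),
            Endian::Little => u16::from_le_bytes(bytes),
        }
    }
}
```

The match adds a branch per read, which is the cost of deciding the byte order at run time instead of through a type parameter.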
  • Unlicense is flawed enough to scare me off

    I'd like to depend on this for an experiment where I'm rewriting a Python script which examines GIF files in a performance-optimized manner.

    However because of flaws in the Unlicense, it's on my blacklist to ensure proper safety for my users, regardless of the jurisdiction they're in.

    Is there any chance you'd be willing to offer byteorder under something more carefully designed like the Creative Commons CC0 public domain dedication?

    (CC0 is also what the FSF recommends if you want to release your code into the public domain.)

    opened by ssokolow 8
  • Convert endianness while copying in read/write into methods

    Rather than first copying data from the source to the destination buffer and then adjusting endianness, do the conversion while copying. This means that each byte is accessed only once, which (according to benchmarks) speeds up the read_xxx_into and write_xxx_into methods:

    | Benchmark                      | Before [ns/iter] | After [ns/iter] |
    |--------------------------------+------------------+-----------------|
    | slice_i64::read_big_endian     |  34,863  (±  30) | 23,656  (± 935) |
    | slice_i64::read_little_endian  |  15,518  (±  19) | 13,362  (± 405) |
    | slice_i64::write_big_endian    |  30,910  (± 109) | 23,123  (±  91) |
    | slice_i64::write_little_endian |  14,924  (±  21) | 13,209  (± 180) |
    |--------------------------------+------------------+-----------------|
    | slice_u16::read_big_endian     |   7,492  (± 343) |  3,788  (±  16) |
    | slice_u16::read_little_endian  |   3,366  (±   8) |  3,198  (±   3) |
    | slice_u16::write_big_endian    |   4,066  (±   7) |  4,497  (±   8) |
    | slice_u16::write_little_endian |   4,040  (± 946) |  3,193  (±   7) |
    |--------------------------------+------------------+-----------------|
    | slice_u64::read_big_endian     |  35,816  (± 251) | 23,259  (±  21) |
    | slice_u64::read_little_endian  |  15,506  (±  86) | 13,365  (±  81) |
    | slice_u64::write_big_endian    |  30,948  (±  63) | 23,102  (±  36) |
    | slice_u64::write_little_endian |  14,938  (±  17) | 13,158  (±  18) |
    

    The benchmarks were done on AMD Ryzen 9 5900X 12-Core Processor.

    I’m somewhat confused why the little-endian benchmarks show improvements, but the results are reproducible. My best guess is that the compiler fails to optimise out the for v in $dst.iter_mut() { nop(); } loops currently present.

    opened by mina86 0
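
The single-pass idea can be sketched with std only, as below. This is not the PR's implementation: `read_u64_into_be` is a hypothetical stand-in that decodes each element as it is copied, instead of copying the raw bytes first and swapping them in a second loop.

```rust
use std::convert::TryInto;

/// Decode big-endian u64 values from `src` into `dst` in a single pass,
/// converting each element as it is copied.
fn read_u64_into_be(src: &[u8], dst: &mut [u64]) {
    assert_eq!(src.len(), 8 * dst.len(), "length mismatch");
    for (chunk, out) in src.chunks_exact(8).zip(dst.iter_mut()) {
        // `chunks_exact(8)` yields 8-byte slices, so the conversion cannot fail.
        *out = u64::from_be_bytes(chunk.try_into().unwrap());
    }
}
```

Each source byte is read once and each destination element is written once, which is the access pattern the PR description credits for the speed-up.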
  • Changelog is not up to date

    Current version of byteorder is 1.4.3, but the changelog ends with 1.3.4: https://github.com/BurntSushi/byteorder/blob/abffade8232229db557e0a30c395963071624b2b/CHANGELOG.md

    It would be nice if someone can add the changes from the more recent versions. :)

    opened by striezel 0
  • Implement write_uXX_from

    This was discussed in #155, and makes my life much easier when serializing big vectors.

    For now I only implemented write_u32_from, but if this looks OK I can go implement all the others too. I also created slice_to_u8 based on slice_to_u8_mut, but I don't think the comment about "modification of the binary representation of any Copy type" applies to it.

    opened by luizirber 0
  • Add methods that take/return arrays

    The recently-added standard library endian conversion functions like from_be_bytes and to_be_bytes operate on arrays by value rather than slices by reference, which can provide better type safety in some cases.

    It would be great if the ByteOrder trait were to add similar methods so that code which still needs to be generic over byte order can benefit from the array approach. Concretely, I would make use of those methods in the zerocopy::byteorder module if they were available.

    cc @tamird

    opened by joshlf 1
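
What such array-based methods might look like is sketched below with std only. Everything here is hypothetical (`ArrayOrder`, `Big`, `Little`, and the method names) and is not byteorder's API. The point is only that byte orders can stay types while conversions take and return fixed-size arrays.

```rust
/// Hypothetical trait: byte orders are types, and conversions use arrays.
trait ArrayOrder {
    fn u32_from_bytes(bytes: [u8; 4]) -> u32;
    fn u32_to_bytes(n: u32) -> [u8; 4];
}

struct Big;
struct Little;

impl ArrayOrder for Big {
    fn u32_from_bytes(bytes: [u8; 4]) -> u32 {
        u32::from_be_bytes(bytes)
    }
    fn u32_to_bytes(n: u32) -> [u8; 4] {
        n.to_be_bytes()
    }
}

impl ArrayOrder for Little {
    fn u32_from_bytes(bytes: [u8; 4]) -> u32 {
        u32::from_le_bytes(bytes)
    }
    fn u32_to_bytes(n: u32) -> [u8; 4] {
        n.to_le_bytes()
    }
}

/// Generic over byte order; the array length is checked at compile time.
fn round_trip<O: ArrayOrder>(n: u32) -> u32 {
    O::u32_from_bytes(O::u32_to_bytes(n))
}
```

Because the buffers are `[u8; 4]` rather than `&[u8]`, a wrong-sized buffer is a compile error instead of a run-time panic.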
Owner
Andrew Gallant
I love to code.
A HTTP Archive format (HAR) serialization & deserialization library, written in Rust.

har-rs HTTP Archive format (HAR) serialization & deserialization library, written in Rust. Install Add the following to your Cargo.toml file: [depende

Sebastian Mandrean 25 Dec 24, 2022
A HTML entity encoding library for Rust

A HTML entity encoding library for Rust Example usage All example assume a extern crate htmlescape; and use htmlescape::{relevant functions here}; is

Viktor Dahl 41 Nov 1, 2022
tnetstring serialization library for rust.

TNetStrings: Tagged Netstrings This module implements bindings for the tnetstring serialization format. API let t = tnetstring::str("hello world"); le

Erick Tryzelaar 16 Jul 14, 2019
A TOML encoding/decoding library for Rust

toml-rs A TOML decoder and encoder for Rust. This library is currently compliant with the v0.5.0 version of TOML. This library will also likely contin

Alex Crichton 1k Dec 30, 2022
A series of compact encoding schemes for building small and fast parsers and serializers

A series of compact encoding schemes for building small and fast parsers and serializers

Manfred Kröhnert 2 Feb 5, 2022
Encoding and decoding support for BSON in Rust

bson-rs Encoding and decoding support for BSON in Rust Index Overview of BSON Format Usage BSON Values BSON Documents Modeling BSON with strongly type

mongodb 304 Dec 30, 2022
pem-rs pem PEM jcreekmore/pem-rs [pem] — A Rust based way to parse and encode PEM-encoded data

pem A Rust library for parsing and encoding PEM-encoded data. Documentation Module documentation with examples Usage Add this to your Cargo.toml: [dep

Jonathan Creekmore 30 Dec 27, 2022
Variable-length signed and unsigned integer encoding that is byte-orderable for Rust

ordered-varint Provides variable-length signed and unsigned integer encoding that is byte-orderable. This crate provides the Variable trait which enco

Khonsu Labs 7 Dec 6, 2022
Free Rust-only Xbox ADPCM encoder and decoder

XbadPCM Safe (and optionally no-std) Rust crate for encoding and decoding Xbox ADPCM blocks. Decoding example Here is example code for decoding stereo

Snowy 5 Nov 20, 2022
Crate to parse and emit EDN

edn-rs Near Stable no breaking changes expected. Crate to parse and emit EDN This lib does not make effort to conform the EDN received to EDN Spec. Th

Julia Naomi 61 Dec 19, 2022
Fast and compact sets of bytes or ASCII characters

bset Fast and compact sets of bytes and ASCII characters, useful for searching, parsing and determining membership of a given byte in the given set. T

null 26 Jul 19, 2022
Decode SCALE bytes into custom types using a scale-info type registry and a custom Visitor impl.

scale-decode This crate attempts to simplify the process of decoding SCALE encoded bytes into a custom data structure given a type registry (from scal

Parity Technologies 6 Sep 20, 2022
rust-jsonnet - The Google Jsonnet( operation data template language) for rust

rust-jsonnet ==== Crate rust-jsonnet - The Google Jsonnet( operation data template language) for rust Google jsonnet documet: (http://google.github.io

Qihoo 360 24 Dec 1, 2022
MessagePack implementation for Rust / msgpack.org[Rust]

RMP - Rust MessagePack RMP is a pure Rust MessagePack implementation. This repository consists of three separate crates: the RMP core and two implemen

Evgeny Safronov 840 Dec 30, 2022
A Rust ASN.1 (DER) serializer.

rust-asn1 This is a Rust library for parsing and generating ASN.1 data (DER only). Installation Add asn1 to the [dependencies] section of your Cargo.t

Alex Gaynor 85 Dec 16, 2022
Implementation of Bencode encoding written in rust

Rust Bencode Implementation of Bencode encoding written in rust. Project Status Not in active developement due to lack of time and other priorities. I

Arjan Topolovec 32 Aug 6, 2022
Cap'n Proto for Rust

Cap'n Proto for Rust documentation blog Introduction Cap'n Proto is a type system for distributed systems. With Cap'n Proto, you describe your data an

Cap'n Proto 1.5k Dec 26, 2022
A Gecko-oriented implementation of the Encoding Standard in Rust

encoding_rs encoding_rs an implementation of the (non-JavaScript parts of) the Encoding Standard written in Rust and used in Gecko (starting with Fire

Henri Sivonen 284 Dec 13, 2022
Character encoding support for Rust

Encoding 0.3.0-dev Character encoding support for Rust. (also known as rust-encoding) It is based on WHATWG Encoding Standard, and also provides an ad

Kang Seonghoon 264 Dec 14, 2022