A fast monadic-style parser combinator designed to work on stable Rust.

Chomp

Chomp is a fast monadic-style parser combinator library designed to work on stable Rust. It was written as the culmination of the experiments detailed in a series of blog posts.

For its current capabilities, you will find that Chomp performs consistently as well as, if not better than, optimized C parsers, while being vastly more expressive. For an example that builds a performant HTTP parser out of smaller parsers, see http_parser.rs.

Installation

Add the following line to the dependencies section of your Cargo.toml:

[dependencies]
chomp = "0.3.1"

Usage

Parsers are functions from a slice over an input type Input<I> to a ParseResult<I, T, E>, which may be thought of as either a success resulting in type T, an error of type E, or a partially completed result which may still consume more input of type I.

The input type is almost never manually manipulated. Rather, one uses parsers from Chomp by invoking the parse! macro. This macro was designed intentionally to be as close as possible to Haskell's do-syntax or F#'s "computation expressions", which are used to sequence monadic computations. At a very high level, usage of this macro allows one to declaratively:

  • Sequence parsers, while short circuiting the rest of the parser if any step fails.
  • Bind previous successful results to be used later in the computation.
  • Return a composite data structure using the previous results at the end of the computation.

In other words, just as a normal Rust function usually looks something like this:

fn f() -> (u8, u8, u8) {
    let a = read_digit();
    let b = read_digit();
    launch_missiles();
    return (a, b, a + b);
}

A Chomp parser with a similar structure looks like this:

fn f<I: U8Input>(i: I) -> SimpleResult<I, (u8, u8, u8)> {
    parse!{i;
        let a = digit();
        let b = digit();
                string(b"missiles");
        ret (a, b, a + b)
    }
}
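
As a rough illustration of what this sequencing amounts to, the following std-only sketch writes the same parser with plain Result values and the ? operator. It does not use Chomp: the names ParseErr, PResult, digit and literal are hypothetical stand-ins, and the remaining input is threaded by hand, which is the bookkeeping that parse! hides.

```rust
/// Hypothetical error type for this sketch (not Chomp's error type).
#[derive(Debug, PartialEq)]
enum ParseErr {
    UnexpectedEnd,
    Unexpected(u8),
}

/// Each parser takes the remaining input and returns the rest plus a value.
type PResult<'a, T> = Result<(&'a [u8], T), ParseErr>;

/// Parse one ASCII digit and return its numeric value.
fn digit(i: &[u8]) -> PResult<'_, u8> {
    match i.split_first() {
        Some((&c, rest)) if c.is_ascii_digit() => Ok((rest, c - b'0')),
        Some((&c, _)) => Err(ParseErr::Unexpected(c)),
        None => Err(ParseErr::UnexpectedEnd),
    }
}

/// Match an exact byte string at the front of the input.
fn literal<'a>(i: &'a [u8], s: &[u8]) -> PResult<'a, ()> {
    if i.starts_with(s) {
        return Ok((&i[s.len()..], ()));
    }
    // Report the first mismatching byte, or the end of input if it ran out.
    match i.iter().zip(s).find(|(a, b)| a != b) {
        Some((&c, _)) => Err(ParseErr::Unexpected(c)),
        None => Err(ParseErr::UnexpectedEnd),
    }
}

/// Sequence three parsers; `?` short-circuits on the first failure.
fn f(i: &[u8]) -> PResult<'_, (u8, u8, u8)> {
    let (i, a) = digit(i)?;
    let (i, b) = digit(i)?;
    let (i, ()) = literal(i, b"missiles")?;
    Ok((i, (a, b, a + b)))
}
```

Each `let (i, x) = p(i)?;` line plays the role of one `let x = p();` line inside parse!, with the input passed along explicitly.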

And to implement read_digit we can utilize the map function to manipulate any success value while preserving any error or incomplete state:

// Standard rust, no error handling:
fn read_digit() -> u8 {
    let mut s = String::new();
    std::io::stdin().read_line(&mut s).unwrap();
    s.trim().parse().unwrap()
}

// Chomp, error handling built in, and we make sure we only get a number:
fn read_digit<I: U8Input>(i: I) -> SimpleResult<I, u8> {
    satisfy(i, |c| b'0' <= c && c <= b'9').map(|c| c - b'0')
}

For more documentation, see the rust-doc output.

Example

#[macro_use]
extern crate chomp;

use chomp::prelude::*;

#[derive(Debug, Eq, PartialEq)]
struct Name<B: Buffer> {
    first: B,
    last:  B,
}

fn name<I: U8Input>(i: I) -> SimpleResult<I, Name<I::Buffer>> {
    parse!{i;
        let first = take_while1(|c| c != b' ');
                    token(b' ');  // skipping this char
        let last  = take_while1(|c| c != b'\n');

        ret Name{
            first: first,
            last:  last,
        }
    }
}

assert_eq!(parse_only(name, "Martin Wernstål\n".as_bytes()), Ok(Name{
    first: &b"Martin"[..],
    last: "Wernstål".as_bytes()
}));

License

Licensed under either of the Apache-2.0 license or the MIT license, at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

Contact

File an issue on GitHub or visit gitter.im/m4rw3r/chomp.

Comments
  • Make `Input` a trait

    Problem

    Currently the input type only allows for slices, and is special cased for situations where it may not be the whole of the input. I cannot provide any line/row/offset counting either since it is a concrete type and an extension with that functionality would impact all code.

    This would provide a way to slot in position-aware wrappers to solve #38 neatly.

    Proposed solution

    Convert Input<I> into a trait, with ret and err as provided methods, the input-token type would be the associated type Token. All the primitive methods (currently provided by InputClone and InputBuffer) are also present but require an instance of the zero-sized type Guard which cannot be instantiated outside of the primitives module (note the private field). The primitives would be reachable through methods on a Primitives trait which has to be used separately (the blanket implementation for all Input makes it possible to easily use it once it is in scope).

    use primitives::Guard;
    pub use primitives::Primitives;
    
    pub trait Input: Sized {
        type Token;
        type Marker;
    
        fn ret<T>(self, t: T) -> ParseResult<Self, T> {
            ParseResult(self, t)
        }
    
        fn _consume(self, usize, Guard)        -> Self;
        fn _buffer(&self, Guard)               -> &[Self::Token];
        fn _is_end(&self, Guard)               -> bool;
        fn _mark(&self, Guard)                 -> Self::Marker;
        fn _restore(self, Self::Marker, Guard) -> Self;
    }
    
    pub mod primitives {
        use Input;
    
        pub struct Guard(());
    
        pub trait Primitives: Input {
            fn consume(self, n: usize) -> Self {
                self._consume(n, Guard(()))
            }
            fn buffer(&self) -> &[Self::Token] {
                self._buffer(Guard(()))
            }
            fn is_end(&self) -> bool {
                self._is_end(Guard(()))
            }
            fn mark(&self) -> Self::Marker {
                self._mark(Guard(()))
            }
            fn restore(self, m: Self::Marker) -> Self {
                self._restore(m, Guard(()))
            }
        }
    
        impl<I: Input> Primitives for I {}
    }
    

    The mark method is the replacement for InputClone; it should be used together with the restore method to return the Input to its earlier state.
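
A minimal sketch of how mark and restore could support backtracking, assuming a plain cursor over a byte slice. SliceInput, string and or below are hypothetical illustrations written against std only; they omit the Guard argument and are not the proposed trait itself.

```rust
/// Hypothetical cursor over a byte slice (not Chomp's `Input`).
struct SliceInput<'a> {
    buf: &'a [u8],
    pos: usize,
}

impl<'a> SliceInput<'a> {
    fn new(buf: &'a [u8]) -> Self {
        SliceInput { buf, pos: 0 }
    }
    /// Remember the current position.
    fn mark(&self) -> usize {
        self.pos
    }
    /// Rewind to a previously taken mark.
    fn restore(mut self, m: usize) -> Self {
        self.pos = m;
        self
    }
    /// Advance past `n` tokens.
    fn consume(mut self, n: usize) -> Self {
        self.pos += n;
        self
    }
    /// The input that has not been consumed yet.
    fn buffer(&self) -> &'a [u8] {
        let buf: &'a [u8] = self.buf;
        &buf[self.pos..]
    }
}

/// Match an exact byte string. On failure the matching prefix stays
/// consumed, so the caller has something to rewind.
fn string<'a>(i: SliceInput<'a>, s: &[u8]) -> (SliceInput<'a>, Result<(), ()>) {
    let n = i.buffer().iter().zip(s).take_while(|(a, b)| a == b).count();
    let ok = n == s.len();
    (i.consume(n), if ok { Ok(()) } else { Err(()) })
}

/// Try `a`; if it fails, rewind to the mark and try `b`.
fn or<'a, T, A, B>(i: SliceInput<'a>, a: A, b: B) -> (SliceInput<'a>, Result<T, ()>)
where
    A: FnOnce(SliceInput<'a>) -> (SliceInput<'a>, Result<T, ()>),
    B: FnOnce(SliceInput<'a>) -> (SliceInput<'a>, Result<T, ()>),
{
    let m = i.mark();
    match a(i) {
        (i, Ok(t)) => (i, Ok(t)),
        (i, Err(())) => b(i.restore(m)),
    }
}
```

Here the marker is just an offset; a non-slice input would presumably pick a different Marker type, which is why the proposal leaves it as an associated type.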

    Pros

    • Input can be implemented directly for slices, eliminating certain branches from parsers and combinators like many, take_while, eof and so on.
    • An Input implementation can be provided for line-counting which could be slotted in to provide line-counting in any existing parsers
    • The mark and restore methods would provide mechanisms allowing types which do not wholly consist of slices to work. The buffer method is probably not the right choice for that, though; it will need a change to support e.g. ropes.
    • All parsers need to be generic. Before, we could get away with only concrete types since Input<u8> is a concrete type; Input<Token=u8> will not be a concrete type.

    Cons

    • Parser function signature change, very backwards incompatible:

      // old
      fn my_parser<'a, I>(i: Input<'a, I>, ...) -> ParseResult<'a, I, T, E>
      // old, lifetime elision:
      fn my_parser<I>(i: Input<I>, ...) -> ParseResult<I, T, E>
      // new
      fn my_parser<I: Input>(i: I, ...) -> ParseResult<I, T, E>
      
    • The type I: Input can no longer be guaranteed to be linear since the #[must_use] annotation cannot be put on the concrete type.

      This is probably not an issue in practice since the I type is required by value to create a ParseResult and the ParseResult in turn is ultimately required by the functions which start the parsing.

    enhancement 
    opened by m4rw3r 13
  • Is there a way to get current position?

    Hi! I'm wondering if it would be possible to add a function that could provide the current position in the file (or stream)?

    In my case, I'm parsing from a file and would like to capture the line number in particular.

    I haven't had a chance to dig through the code much yet, but if I were to take a stab at adding it, I'd definitely appreciate a few pointers! I'm guessing it would have to bubble up from the buffer...

    enhancement 
    opened by readysetmark 9
  • How do I examine success/fail?

    let parse_result = parse!{i;
    ..
    };
    
    // I now have to execute some Rust code to see what parser I should call next.
    let input2: Input<'i, u8> = match parse_result.into_inner() {
        // stuck here.
        // Ok(o) => o,
        // Err(e) => return parse_result
    };
    
    I'm a bit lost walking through the types. I simply want to continue with an Input, or return the parse_result.
    Any help would be appreciated.
    Thanks!
    
    opened by MarkSwanson 7
  • Infinite loop?

    skip_many() and many() do not seem to be propagating the incomplete state. Or maybe the or combinator is always resetting the stream position and not propagating the error?

    I expect the flow to be:

    • skip_many(all)
    • all OR tests b() and c() - both fail
    • all returns fail
    • skip_many returns fail <-- this does not happen ... infinite loop ...

    Ideas?

    Thanks!

    i == "fffff".as_bytes(); // will never match any token...
    parse!{i;
        skip_many(all);
        ...
    
    pub fn c<'a>(i: Input<'a, u8>) -> U8Result<()> {
        parse!{i;
            take_while(is_whitespace);
            ret () } }
    
    pub fn b<'i, 'a>(i: Input<'i, u8>, s: &'a str) -> U8Result<'i, ()> {
        parse!{i;
            take_while(is_whitespace);
            ret () } }
    
    pub fn all<'a>(i: Input<'a, u8>) -> U8Result<()> {
        let s = String::new();
        parse!{i;
                b(&s) <|>
                c();
            ret () } }
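
One possible cause, stated as an assumption rather than a confirmed diagnosis: a zero-or-more parser such as take_while can succeed without consuming any input, so b and c may never fail on "fffff", and a repetition combinator would then repeat forever. The std-only sketch below models that situation and shows a progress check that ends the loop. skip_whitespace and skip_many here are hypothetical stand-ins, not Chomp's functions.

```rust
/// Zero-or-more whitespace: succeeds even when nothing matches.
fn skip_whitespace(i: &[u8]) -> Option<&[u8]> {
    let n = i.iter().take_while(|c| c.is_ascii_whitespace()).count();
    Some(&i[n..])
}

/// Repeat `p` until it fails, guarding against parsers that
/// succeed without consuming anything.
fn skip_many<'a, P>(mut i: &'a [u8], mut p: P) -> &'a [u8]
where
    P: FnMut(&'a [u8]) -> Option<&'a [u8]>,
{
    while let Some(rest) = p(i) {
        if rest.len() == i.len() {
            // No progress was made: without this check the loop never ends.
            break;
        }
        i = rest;
    }
    i
}
```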
    
    opened by MarkSwanson 7
  • run_scanner state can't depend on last token

    I'm trying to parse one utf8 character. I tried run_scanner and std::char::from_u32, but it doesn't work because when I get a whole character, the way to signal it is to return None, which throws away the state.
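
For comparison, a std-only sketch of decoding a single UTF-8 character without a scanner: the sequence length is read from the first byte, so no state has to survive the final token. utf8_len and utf8_char are hypothetical helper names, and the function works on a plain byte slice rather than a Chomp input.

```rust
/// Number of bytes in a UTF-8 sequence, judged from its first byte.
fn utf8_len(first: u8) -> Option<usize> {
    match first {
        0x00..=0x7F => Some(1),
        0xC2..=0xDF => Some(2),
        0xE0..=0xEF => Some(3),
        0xF0..=0xF4 => Some(4),
        _ => None, // continuation byte or invalid lead byte
    }
}

/// Decode one character from the front of `i`, returning it with the rest.
fn utf8_char(i: &[u8]) -> Option<(char, &[u8])> {
    let n = utf8_len(*i.first()?)?;
    // `get` returns None when fewer than `n` bytes are available.
    let head = i.get(..n)?;
    // `from_utf8` performs the full validation of the sequence.
    let c = std::str::from_utf8(head).ok()?.chars().next()?;
    Some((c, &i[n..]))
}
```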

    opened by dckc 6
  • `or` with many branches

    I need a parser with multiple branches. My first attempt was:

    or(i, parser! { string(b"ERROR"); ret LogLevel::Error }, |i| {
        or(i, parser! { string(b"WARN"); ret LogLevel::Warn }, |i| {
            or(i,
               parser! { string(b"INFO"); ret LogLevel::Info },
               parser! { string(b"DEBUG"); ret LogLevel::Debug })
        })
    })
    

    It works, but it feels too verbose, so I tried to write this macro:

    macro_rules! alt {
        ($i:expr, $a:expr) => { $a };
        ($i:expr, $a:expr, $b:expr) => { or($i, $a, $b) };
        ($i:expr, $a:expr, $($b:expr),*) => { or($i, $a, |i| alt!(i, $($b),*)) };
    }
    

    Now, the parser looks much better:

    alt!(i,
        { parser! { string(b"ERROR"); ret LogLevel::Error } },
        { parser! { string(b"WARN");  ret LogLevel::Warn } },
        { parser! { string(b"INFO");  ret LogLevel::Info } },
        { parser! { string(b"DEBUG"); ret LogLevel::Debug } }
    )
    

    I have some questions:

    • Is this a good solution? Is there any alternative?
    • Can you include something like this in the crate? I think that more people will need a multi-or combinator.
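
As a point of comparison for the first question, here is a std-only sketch of the same first-match-wins choice written as a table lookup instead of nested or calls. It assumes the alternatives are fixed prefixes and works on &str rather than a Chomp input; log_level and ALTS are hypothetical names.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum LogLevel {
    Error,
    Warn,
    Info,
    Debug,
}

/// Try each alternative in order and return the first one that matches,
/// together with the remaining input.
fn log_level(i: &str) -> Option<(LogLevel, &str)> {
    const ALTS: [(&str, LogLevel); 4] = [
        ("ERROR", LogLevel::Error),
        ("WARN", LogLevel::Warn),
        ("INFO", LogLevel::Info),
        ("DEBUG", LogLevel::Debug),
    ];
    ALTS.iter()
        .find_map(|&(tag, lvl)| i.strip_prefix(tag).map(|rest| (lvl, rest)))
}
```

A table only covers literal prefixes; alternatives that are arbitrary parsers still need something like the alt! macro above.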

    Thanks!

    enhancement combinator 
    opened by ayosec 5
  • More comprehensive examples

    I've looked at a bunch of libraries for parsing in Rust and chomp's API feels most intuitive based on small examples. However, I've found myself struggling to make even a simple parser work in practice. It's pretty easy to mess up a macro and have the compiler complain, for example, and I'm still not sure how to parse something like "1.0" into an f64 efficiently and correctly.
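
For the "1.0" case specifically, one approach is to scan only for the extent of the number and leave the conversion to the standard library. The sketch below is std-only and makes no claim about Chomp's API or about being the most efficient option; float is a hypothetical helper that accepts an optional minus sign, digits, and an optional fractional part.

```rust
/// Parse a decimal number such as `1.0` or `-12.5` from the front of `i`.
fn float(i: &[u8]) -> Option<(f64, &[u8])> {
    // Count the ASCII digits starting at offset `from`.
    let digits = |from: usize| i[from..].iter().take_while(|c| c.is_ascii_digit()).count();

    let mut n = 0;
    if i.first() == Some(&b'-') {
        n += 1; // optional sign
    }
    let int_len = digits(n);
    if int_len == 0 {
        return None; // require at least one integer digit
    }
    n += int_len;
    // Optional fraction: a dot followed by at least one digit.
    if i.get(n) == Some(&b'.') && digits(n + 1) > 0 {
        n += 1 + digits(n + 1);
    }
    // The scan above only finds the extent; `parse` does the conversion.
    let text = std::str::from_utf8(&i[..n]).ok()?;
    Some((text.parse::<f64>().ok()?, &i[n..]))
}
```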

    Having a more complete example would be super helpful to people approaching the library. Any format with a reasonably wide variety of data types (strings, ints, floats) would be great - maybe JSON?

    I'll submit a PR for something like this if/when I make enough progress.

    documentation 
    opened by josephsavona 4
  • Input trait

    See #44

    • [x] Investigate removal of State::Incomplete

      Simplifies all combinators which otherwise had to special-case the incomplete states. Does not seem to actually improve performance in any way despite decreasing the number of possible branches, guess that goes to show how good branch-prediction is in a modern CPU.

      Conclusion: removed State::Incomplete.

    • [x] Implement Input for &str

    • [x] ~~Implement Input for an Iterator wrapper~~

      Moved to separate PR #49

    • [x] Implement Input for Tendril

    • [x] Replace the inner State type with Result since it is just a success/fail state and makes for easier conversions and simpler code in general.

    • [x] Implement a numbering wrapper for any Input.

    • [x] Move input::InputBuf to buffer module and replace the mode property with a boolean flag.

    • [x] Update tests to accommodate for Incomplete removal and restructuring of the Input.

      It is probably suitable to investigate if Quickcheck can be used here (#33)

    • [x] Rework buffer module

      Make sure lifetimes are removed (now part of Input implementation), simplify.

    • [x] Update doctests

    • [x] Support for optimized skip_while in Input

    opened by m4rw3r 4
  • what type of argument to supply for i.err()?

    What do I give as the argument?

    I can't figure out what to use in place of 0:

    fn expr(i: Input<u8>) -> U8Result<ExprRef> {
        or(i,
           literal(i),
           // TODO: all the other sorts of Expr
           i.err(0))  // <- here
    }
    

    I can't understand the diagnostics:

    cargo test
       Compiling monte-rs v0.1.0 (file:///home/connolly/projects/monte-rs)
    src/mast.rs:24:5: 24:7 error: the trait `core::ops::FnOnce<(chomp::input::Input<'_, u8>,)>` is not implemented for the type `chomp::parse_result::ParseResult<'_, u8, Box<alloc::rc::Rc<kernel::Expr>>, chomp::parsers::Error<u8>>` [E0277]
    src/mast.rs:24     or(i,
                       ^~
    src/mast.rs:24:5: 24:7 help: run `rustc --explain E0277` to see a detailed explanation
    src/mast.rs:24:5: 24:7 note: required by `chomp::combinators::or`
    src/mast.rs:24:5: 24:7 error: the trait `core::ops::FnOnce<(chomp::input::Input<'_, u8>,)>` is not implemented for the type `chomp::parse_result::ParseResult<'_, u8, _, _>` [E0277]
    src/mast.rs:24     or(i,
                       ^~
    src/mast.rs:24:5: 24:7 help: run `rustc --explain E0277` to see a detailed explanation
    src/mast.rs:24:5: 24:7 note: required by `chomp::combinators::or`
    src/mast.rs:24:5: 24:7 error: the trait `core::ops::FnOnce<(chomp::input::Input<'_, u8>,)>` is not implemented for the type `chomp::parse_result::ParseResult<'_, u8, Box<alloc::rc::Rc<kernel::Expr>>, chomp::parsers::Error<u8>>` [E0277]
    src/mast.rs:24     or(i,
                       ^~
    src/mast.rs:24:5: 24:7 help: run `rustc --explain E0277` to see a detailed explanation
    src/mast.rs:24:5: 24:7 note: required by `chomp::combinators::or`
    src/mast.rs:24:5: 24:7 error: the trait `core::ops::FnOnce<(chomp::input::Input<'_, u8>,)>` is not implemented for the type `chomp::parse_result::ParseResult<'_, u8, _, _>` [E0277]
    src/mast.rs:24     or(i,
                       ^~
    src/mast.rs:24:5: 24:7 help: run `rustc --explain E0277` to see a detailed explanation
    src/mast.rs:24:5: 24:7 note: required by `chomp::combinators::or`
    error: aborting due to 2 previous errors
    
    opened by dckc 4
  • Size hint for internal iterator used by `many*` and `count`

    Implementing Iterator::size_hint will enable more efficient allocation for the containers. According to profiling, a lot of time is spent allocating and reallocating in some situations; a better size_hint would improve performance.

    • count should yield (n, Some(n)) since it will always result in n elements on success.
    • many1 and sep_by1 should yield (1, None) since they have no upper bound
    • many, many_till and sep_by can use the default value of (0, None)
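
A small std-only sketch of the count case: an iterator that knows exactly how many items remain reports that through size_hint, and collect can use it to size the container. CountIter is a hypothetical stand-in for the internal iterator, with a closure in place of the parser.

```rust
/// Yields exactly `remaining` items produced by `next_item`.
struct CountIter<F> {
    remaining: usize,
    next_item: F,
}

impl<T, F: FnMut() -> T> Iterator for CountIter<F> {
    type Item = T;

    fn next(&mut self) -> Option<T> {
        if self.remaining == 0 {
            return None;
        }
        self.remaining -= 1;
        Some((self.next_item)())
    }

    /// Exact bounds: the lower and upper limit are both `remaining`.
    fn size_hint(&self) -> (usize, Option<usize>) {
        (self.remaining, Some(self.remaining))
    }
}
```

Collecting `CountIter { remaining: 3, next_item: || 7u8 }` into a Vec gives three elements, and the hint is available to the collection before the first item is produced.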

    It might also be feasible to implement a combinator which is a hybrid of count and many, where specifying both a lower and upper bound is possible. This would make it a lot more efficient to allocate some parts (and by using monadic composition it can even reserve space for a certain known number of elements which was specified earlier in the message).

    Spec for bounded many

    fn many(Input<I>, R, Parser) -> ParseResult<I, T, E>
      where R:      BoundedRange,
            Parser: FnMut(Input<I>) -> ParseResult<I, U, E>,
            T:      FromIterator<U>
    
    trait BoundedRange { ... }
    
    impl BoundedRange for Range { ... }
    impl BoundedRange for RangeFull { ... }
    impl BoundedRange for RangeFrom { ... }
    impl BoundedRange for RangeTo { ... }
    
    • Iteration should stop once the max value is reached (if it is specified by the range), no more than n items should be emitted unless the range is lacking an upper bound
    • A size_hint based on the range should be provided
    • If an error or incomplete is encountered outside of the range (i.e. if fewer items than the lower bound have been emitted), the error should be propagated
    • If an error is encountered inside of the range the parser should be considered complete and return the resulting FromIterator value
    • If an incomplete is encountered inside of the range of the parser it should be considered complete if the input is END_OF_INPUT and input.len is 0.
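
The error-handling rules above can be sketched with std only. This version takes an explicit minimum and an optional inclusive maximum instead of a BoundedRange, works on byte slices, and ignores the incomplete state; many_bounded and digit are hypothetical names, not the crate's implementation.

```rust
/// Apply `p` between `min` and `max` times, collecting the results.
fn many_bounded<'a, T, E, P>(
    mut i: &'a [u8],
    min: usize,
    max: Option<usize>,
    mut p: P,
) -> Result<(Vec<T>, &'a [u8]), E>
where
    P: FnMut(&'a [u8]) -> Result<(T, &'a [u8]), E>,
{
    // Reserve the lower bound up front, as a size hint would allow.
    let mut out = Vec::with_capacity(min);
    // Stop once the maximum is reached, if there is one.
    while max.map_or(true, |m| out.len() < m) {
        match p(i) {
            Ok((t, rest)) => {
                out.push(t);
                i = rest;
            }
            // Fewer items than the lower bound: propagate the error.
            Err(e) if out.len() < min => return Err(e),
            // Inside the range: the parser is considered complete.
            Err(_) => break,
        }
    }
    Ok((out, i))
}

/// Example element parser: one ASCII digit.
fn digit(i: &[u8]) -> Result<(u8, &[u8]), &'static str> {
    match i.split_first() {
        Some((&c, rest)) if c.is_ascii_digit() => Ok((c - b'0', rest)),
        _ => Err("expected digit"),
    }
}
```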

    TODO

    • [x] bounded::many spec
    • [x] BoundedRange trait
    • [x] bounded::many impls
      • [x] Replace uses of iter::Iter
        • [x] count
        • [x] many
        • [x] many1
        • [x] sep_by
        • [x] sep_by1
    • [x] bounded::many_till
      • [x] Replace uses of iter::IterTill (many_till)
    • [x] bounded::skip_many
    opened by m4rw3r 4
  • Returning the entire slice matched by a chain of parsers

    Is there a clean way to use the parse! macro and return the entire slice that was matched? Currently, I do something like this:

    // An identifier is an alphanumeric string that doesn't start with a digit.
    fn identifier<I: U8Input>(i: I) -> SimpleResult<I, ()> {
        parse!{i;
            satisfy(is_alpha);
            take_while(is_alphanumeric);
    
            ret ()
        }
    }
    
    // An alias definition is two identifiers separated by an equals sign, e.g. "foo=bar".
    fn alias<I: U8Input>(i: I) -> SimpleResult<I, (I::Buffer, I::Buffer)> {
        parse!{i;
            let (left, _)  = matched_by(identifier);
                            token(b'=');
            let (right, _) = matched_by(identifier);
    
            ret (left, right)
        }
    }
    

    It would be nicer if alias didn't have to use matched_by and could just say let left = identifier(). Does chomp provide a good way of doing this?

    question 
    opened by paigeruten 3
  • Fix string parser

    this is just one part of the story (as described in #67), but it seems to me that one piece of the puzzle would improve the situation already. would you be willing to accept more pull requests/a bigger chunk of changes to other parser functions?

    opened by dario23 0
  • Question: What is the idiomatic way of parsing a string from a byte slice?

    In other words, is there a concise way to map an arbitrary Result to a chomp error?

    In nom, I would do the following:

    map_res!(
    	take_until_and_consume!("\n"),
    	str::from_utf8
    )
    

    How would I do that with chomp?
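
To make the question concrete, this is the behaviour being asked for, written against std only: take the bytes up to a newline, run a fallible conversion on them, and fold its error into the parser's own error type with map_err. LineError and line are hypothetical names; how this maps onto Chomp's combinators is the open question.

```rust
use std::str::Utf8Error;

/// Hypothetical error type combining parse and conversion failures.
#[derive(Debug, PartialEq)]
enum LineError {
    MissingNewline,
    InvalidUtf8(Utf8Error),
}

/// Take everything up to the next `\n` (consuming the newline),
/// then validate the taken bytes as UTF-8.
fn line(i: &[u8]) -> Result<(&str, &[u8]), LineError> {
    let end = i
        .iter()
        .position(|&c| c == b'\n')
        .ok_or(LineError::MissingNewline)?;
    // Convert the foreign error into this parser's error type.
    let text = std::str::from_utf8(&i[..end]).map_err(LineError::InvalidUtf8)?;
    Ok((text, &i[end + 1..]))
}
```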

    opened by kjvalencik 0
  • string parser (and possibly others internally using consume_while) force unnecessary stream reads

    problem

    the chomp::parsers::string parser (and possibly others internally using consume_while) might force unnecessary stream reads. example code:

    #[macro_use]
    extern crate chomp;
    
    use chomp::prelude::*;
    use chomp::buffer::{Source, Stream};
    
    use std::net::TcpStream;
    
    
    fn main() {
        let tcp = TcpStream::connect("faumail.fau.de:143").unwrap();
        let mut src = Source::new(tcp);
    
        // IMAP lines end in b"\r\n", so the real text is everything up to b'\r',
        // but we have to read the line ending nonetheless before reading any future stuff
        let p = src.parse(parser!{take_till(|c| c == b'\r') <* string(b"\r\n")});
        println!("{:?}", p);
    }
    

    expected output: Ok(<some bytes from the imap server welcome line>)

    actual output: Err(Retry)

    cause

    the string parser (src/parsers.rs:378) uses consume_while(f), which first reads the next token from the input stream, and only after that inspects it (using f) for whether to consume it or not. note this is not a bug in consume_while, but it's perfectly fine, expected behaviour. the problem with using it the way it currently is for string(s) is that after len(s) tokens have been consumed, we could return successfully, but consume_while waits for the next token to call its decider function on (which then determines that it has read len(s) tokens already and tells consume_while to quit), which in some cases can force a read on the underlying stream when actually the answer would be clear.
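
a std-only model of the difference, with a byte iterator standing in for the stream. both functions are hypothetical illustrations of the two strategies, not the code in src/parsers.rs: the first fetches a token and only then decides whether it was needed, the second never asks for more than len(s) tokens.

```rust
/// Fetch-then-decide: a token is pulled before the predicate can say
/// "stop", so one extra token is requested after a full match.
fn string_fetch_then_decide<I: Iterator<Item = u8>>(src: &mut I, s: &[u8]) -> bool {
    let mut matched = 0;
    while let Some(c) = src.next() {
        if matched == s.len() {
            break; // the answer was already known before this fetch
        }
        if c != s[matched] {
            return false;
        }
        matched += 1;
    }
    matched == s.len()
}

/// Bounded: pulls at most `s.len()` tokens and stops at the first mismatch.
fn string_bounded<I: Iterator<Item = u8>>(src: &mut I, s: &[u8]) -> bool {
    s.iter().all(|&expected| src.next() == Some(expected))
}
```

on a socket the extra fetch corresponds to a read that may block or ask for a retry even though the match is already decided.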

    solution

    i wrote a (very hackish) fix for the string parser at https://github.com/dario23/chomp/tree/fix_string but (without having checked in depth) i'm expecting more parsers to be affected. probably a more exhaustive fix would include adding consume_while_max_n(f, usize).

    i'd be happy to propose changes and submit a PR, but only after hearing your opinion on the matter :-)

    opened by dario23 1
  • Improve parse! macro documentation

    Currently the parse! macro documentation does not detail exactly what operators like <* expand to in terms of normal code. Having access to this is useful to debug certain issues which can arise in macro usage.

    opened by m4rw3r 0
  • Accessing numbering::InputPosition::position via map_err

    I have a use case where I'd like to somehow pass numbering::InputPosition::position to an Error type as a way of reporting parsing errors at a location (e.g. line/column location).

    The issue is that I'm unable to access numbering::InputPosition::position from within chomp::types::ParseResult::map_err function.

    I adapted map_err into map_err2 as follows: https://github.com/dashed/chomp/commit/3f1998b1d06394ed5b1a8c765230371b6a4f4533

    This enables me to do this:

    type ESParseResult<I, T> = ParseResult<I, T, ParseError>;
    
    fn some_parser<I: U8Input>(i: InputPosition<I, CurrentPosition>)
        -> ESParseResult<InputPosition<I, CurrentPosition>, ()> {
        parse!{i;
    
            let _var = (i -> {
                string(i, b"var")
                    .map_err2(|_, i| {
                        let loc = i.position();
                        ParseError::Expected(loc, "Expected var here.")
                    })
            });
    
            // ...
    
            ret {()}
        }
    }
    

    I'd love to hear any feedback on this, especially for any better alternative approaches. 👍


    Appendix

    CurrentPosition type for reference:

    #[derive(Debug, Copy, Clone, PartialEq, Eq, Ord, PartialOrd, Hash)]
    pub struct CurrentPosition(
        // The current line, zero-indexed.
        u64,
        // The current col, zero-indexed.
        u64
    );
    
    impl CurrentPosition {
        // Creates a new (line, col) counter with zero.
        pub fn new() -> Self {
            CurrentPosition(0, 0)
        }
    }
    
    impl Numbering for CurrentPosition {
        type Token  = u8;
    
        fn update<'a, B>(&mut self, b: &'a B)
            where B: Buffer<Token=Self::Token> {
                b.iterate(|c| if c == b'\n' {
                    self.0 += 1; // line num
                    self.1 = 0;  // col num
                } else {
                    self.1 += 1; // col num
                });
        }
    
        fn add(&mut self, t: Self::Token) {
            if t == b'\n' {
                self.0 += 1; // line num
                self.1 = 0;  // col num
            } else {
                self.1 += 1; // col num
            }
        }
    }
    
    impl<I: Input, T, E> ParseResult<I, T, E> {
    
        // ...
    
        #[inline]
        pub fn map_err2<V, F>(self, f: F) -> ParseResult<I, T, V>
          where F: FnOnce(E, &I) -> V {
            match self {
                ParseResult(i, Ok(t))  => ParseResult(i, Ok(t)),
                ParseResult(i, Err(e)) => {
                    let err = f(e, &i);
                    ParseResult(i, Err(err))
                },
            }
        }
    
        // ...
    }
    
    enhancement combinator 
    opened by dashed 2
Releases(0.3.1)
  • 0.3.1(Sep 6, 2016)

    Added

    • combinators::either: Like or but allows different result types from the parsers.

    Changes

    • Chomp is now licensed under both MIT and Apache-2.0 licenses.

    Bugfixes

    • Feature tendril now compiles again.
  • 0.3.0(Sep 1, 2016)

    Added

    • prelude module containing basic types, parsers and combinators.

      This is supposed to be the equivalent of Attoparsec's main package.

    • run_parser which executes a parser on any given Input type.

    • buffer::InputBuf which contains a slice and an incomplete flag, much as the old Input struct.

    • Input<Token=T, Buffer=&[T]> implementation for &[T] where T: Copy + PartialEq.

    • Input<Token=char, Buffer=&str> implementation for &str.

    • types::Buffer trait which is implemented for all buffers providing common logic to perform the final parsing on a buffer without knowing the exact buffer implementation.

    • types::U8Input trait alias for Input<Token=u8>.

    • primitives::Primitives trait providing access to the primitive methods of the Input trait.

      This is used for building fundamental parsers/combinators.

    • ParseResult::inspect allowing code to observe the success value.

    • types::numbering module for creating position-aware Input types.

    • parsers::skip_while using an efficient way of skipping data if provided, otherwise falls back on using take_while and throws the result away.

    • chomp::Error now includes a backtrace in test and debug build profiles thanks to the debugtrace crate. Backtraces can also be activated permanently using the backtrace feature but this will incur the significant cost of allocating backtraces even in the release profile.

    • Feature noop_error provides a zero-sized error type for the cases when the expected token is unimportant. Provides a small performance boost.

    Changes

    • Backwards-incompatible: Input is now a trait with associated types Token and Buffer.

      This removes all incomplete logic from the parsers themselves and moves it into the InputBuf type. This InputBuf is used if a partial buffer is in memory. It also allows the parsers to operate directly on slices or use more effective means of storing buffers depending on the Input implementation.

      To upgrade, replace the previous concrete Input type with a generic parameter, use its associated types where required, and refer to the Buffer associated type to allow for zero-copy parsing:

      -fn http_version(i: Input<u8>) -> U8Result<&[u8]>;
      +fn http_version<I: Input<Token=u8>>(i: I) -> SimpleResult<I, I::Buffer>;
      

      The associated types can be restricted if required:

      fn request<I: U8Input>(i: I) -> SimpleResult<I, (Request<I::Buffer>, Vec<Header<I::Buffer>>)>
        where I::Buffer: ::std::ops::Deref<Target=[u8]>;
      
    • Backwards-incompatible: Moved types into a more logical module structure, prelude now exists as a prelude module.

    • Backwards-incompatible: chomp::Error is no longer an enum. This is to facilitate the support of backtraces while keeping code compatible between the different build profiles.

      Use chomp::Error::expected_token to determine if a specific token was expected.

    • Feature verbose_error is now default

    Removed

    • Input::incomplete
    • Input::new
    • ParseResult::expect
    • ParseResult::unwrap_err
    • ParseResult::unwrap
    • buffer::IntoStream
    • primitives::InputClone
    • primitives::State

    Bugfixes

    • combinators::bounded now have a defined behavior when a Range<usize> has start == end: They will parse exactly start times. This also fixed a few overflows and unreachable code being reachable.
    • combinators::bounded::many_till got fixed for an overflow happening when 0: usize was used to limit the number of iterations.
  • 0.2.6(Jul 7, 2016)

    Bugfixes

    • Macro expansion is now again compatible with nightly.
    • Now uses std feature of conv dependency, fixing build breakage on rustc 1.2.0.
  • 0.1.0(Nov 24, 2015)

Owner
Martin Wernstål