Rust parser combinator framework

Overview

nom, eating data byte by byte


nom is a parser combinator library written in Rust. Its goal is to provide tools to build safe parsers without compromising speed or memory consumption. To that end, it makes extensive use of Rust's strong typing and memory safety to produce fast and correct parsers, and provides functions, macros and traits to abstract most of the error-prone plumbing.

nom logo in CC0 license, by Ange Albertini

nom will happily take a byte out of your files :)

Example

Hexadecimal color parser:

extern crate nom;
use nom::{
  IResult,
  bytes::complete::{tag, take_while_m_n},
  combinator::map_res,
  sequence::tuple
};

#[derive(Debug,PartialEq)]
pub struct Color {
  pub red:   u8,
  pub green: u8,
  pub blue:  u8,
}

fn from_hex(input: &str) -> Result<u8, std::num::ParseIntError> {
  u8::from_str_radix(input, 16)
}

fn is_hex_digit(c: char) -> bool {
  c.is_digit(16)
}

fn hex_primary(input: &str) -> IResult<&str, u8> {
  map_res(
    take_while_m_n(2, 2, is_hex_digit),
    from_hex
  )(input)
}

fn hex_color(input: &str) -> IResult<&str, Color> {
  let (input, _) = tag("#")(input)?;
  let (input, (red, green, blue)) = tuple((hex_primary, hex_primary, hex_primary))(input)?;

  Ok((input, Color { red, green, blue }))
}

fn main() {}

#[test]
fn parse_color() {
  assert_eq!(hex_color("#2F14DF"), Ok(("", Color {
    red: 47,
    green: 20,
    blue: 223,
  })));
}

Documentation

If you need any help developing your parsers, please ping geal on IRC (freenode, geeknode, oftc), go to #nom-parsers on Freenode IRC, or use the Gitter chat room.

Why use nom

If you want to write:

Binary format parsers

nom was designed to properly parse binary formats from the beginning. Compared to the usual handwritten C parsers, nom parsers are just as fast, free from buffer overflow vulnerabilities, and handle common patterns for you:

  • TLV
  • Bit level parsing
  • Hexadecimal viewer in the debugging macros for easy data analysis
  • Streaming parsers for network formats and huge files

Text format parsers

While nom was made for binary format at first, it soon grew to work just as well with text formats. From line based formats like CSV, to more complex, nested formats such as JSON, nom can manage it, and provides you with useful tools:

  • Fast case insensitive comparison
  • Recognizers for escaped strings
  • Regular expressions can be embedded in nom parsers to represent complex character patterns succinctly
  • Special care has been given to managing non ASCII characters properly

Programming language parsers

While programming language parsers are usually written manually for more flexibility and performance, nom can be (and has been successfully) used as a prototyping parser for a language.

nom will get you started quickly with powerful custom error types, that you can leverage with nom_locate to pinpoint the exact line and column of the error. No need for separate tokenizing, lexing and parsing phases: nom can automatically handle whitespace parsing, and construct an AST in place.

Streaming formats

While a lot of formats (and the code handling them) assume that they can fit the complete data in memory, there are formats for which we only get a part of the data at once, like network formats, or huge files. nom has been designed for a correct behaviour with partial data: If there is not enough data to decide, nom will tell you it needs more instead of silently returning a wrong result. Whether your data comes entirely or in chunks, the result should be the same.

It allows you to build powerful, deterministic state machines for your protocols.
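The behaviour described above can be sketched in plain Rust without any dependency. This is only an illustration of the idea, not nom's API: the `Step` type, the `parse_frame` function and the frame layout (one length byte followed by the payload) are assumptions made up for this example.

```rust
// Std-only sketch of "ask for more data instead of guessing"; not nom's API.
#[derive(Debug, PartialEq)]
enum Step<'a> {
    /// A full frame was parsed: (remaining input, payload).
    Done(&'a [u8], &'a [u8]),
    /// Not enough data yet; at least this many more bytes are required.
    Incomplete(usize),
}

/// Parses a hypothetical frame: a 1-byte length followed by that many payload bytes.
fn parse_frame(input: &[u8]) -> Step<'_> {
    // Without the length byte we cannot decide anything yet.
    let (len, rest) = match input.split_first() {
        Some((&len, rest)) => (len as usize, rest),
        None => return Step::Incomplete(1),
    };
    // The payload is only partially there: report how much is missing.
    if rest.len() < len {
        return Step::Incomplete(len - rest.len());
    }
    let (payload, remaining) = rest.split_at(len);
    Step::Done(remaining, payload)
}
```

Feeding the first two bytes of a five-byte buffer yields `Incomplete(2)`; calling `parse_frame` again once the rest has arrived yields the same `Done` value as parsing the whole buffer at once.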

Parser combinators

Parser combinators are an approach to parsers that is very different from software like lex and yacc. Instead of writing the grammar in a separate file and generating the corresponding code, you use very small functions with very specific purposes, like "take 5 bytes", or "recognize the word 'HTTP'", and assemble them in meaningful patterns like "recognize 'HTTP', then a space, then a version". The resulting code is small, and looks like the grammar you would have written with other parser approaches.

This has a few advantages:

  • The parsers are small and easy to write
  • The parser components are easy to reuse (if they're general enough, please add them to nom!)
  • The parser components are easy to test separately (unit tests and property-based tests)
  • The parser combination code looks close to the grammar you would have written
  • You can build partial parsers, specific to the data you need at the moment, and ignore the rest
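To make the idea concrete, here is a minimal sketch of the "recognize 'HTTP', then a space, then a version" pattern in plain Rust with no dependencies. The helpers `tag`, `take_while1` and `pair` are simplified stand-ins written for this illustration; they are named after similar nom combinators but are not nom's implementations, and `PResult` and `http_version` are hypothetical names.

```rust
// Std-only illustration of parser combinators; these helpers are NOT nom's API.
// A parser takes the input and returns (remaining input, output), or None on failure.
type PResult<'a, O> = Option<(&'a str, O)>;

/// Builds a parser that recognizes the literal `expected` at the start of the input.
fn tag<'a>(expected: &'static str) -> impl Fn(&'a str) -> PResult<'a, &'a str> {
    move |input: &'a str| {
        input
            .strip_prefix(expected)
            .map(|rest| (rest, &input[..expected.len()]))
    }
}

/// Builds a parser that takes one or more leading characters matching `pred`.
fn take_while1<'a, P>(pred: P) -> impl Fn(&'a str) -> PResult<'a, &'a str>
where
    P: Fn(char) -> bool,
{
    move |input: &'a str| {
        let end = input.find(|c: char| !pred(c)).unwrap_or(input.len());
        if end == 0 {
            None
        } else {
            Some((&input[end..], &input[..end]))
        }
    }
}

/// Combines two parsers into one that runs them in sequence.
fn pair<'a, A, B, F, G>(first: F, second: G) -> impl Fn(&'a str) -> PResult<'a, (A, B)>
where
    F: Fn(&'a str) -> PResult<'a, A>,
    G: Fn(&'a str) -> PResult<'a, B>,
{
    move |input: &'a str| {
        let (rest, a) = first(input)?;
        let (rest, b) = second(rest)?;
        Some((rest, (a, b)))
    }
}

/// "recognize 'HTTP', then a space, then a version"
fn http_version(input: &str) -> PResult<'_, &str> {
    let prefix = pair(tag("HTTP"), tag(" "));
    let version = take_while1(|c: char| c.is_ascii_digit() || c == '.');
    let (rest, (_, v)) = pair(prefix, version)(input)?;
    Some((rest, v))
}
```

Each small parser can be tested on its own, and `http_version` reads almost like the grammar rule it implements.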

Technical features

nom parsers are built around the following features:

  • byte-oriented: The basic type is &[u8] and parsers will work as much as possible on byte array slices (but are not limited to them)
  • bit-oriented: nom can address a byte slice as a bit stream
  • string-oriented: The same kind of combinators can apply on UTF-8 strings as well
  • zero-copy: If a parser returns a subset of its input data, it will return a slice of that input, without copying
  • streaming: nom can work on partial data and detect when it needs more data to produce a correct result
  • descriptive errors: The parsers can aggregate a list of error codes with pointers to the incriminated input slice. Those error lists can be pattern matched to provide useful messages.
  • custom error types: You can provide a specific type to improve errors returned by parsers
  • safe parsing: nom leverages Rust's safe memory handling and powerful types, and parsers are routinely fuzzed and tested with real world data. So far, the only flaws found by fuzzing were in code written outside of nom
  • speed: Benchmarks have shown that nom parsers often outperform many parser combinator libraries like Parsec and attoparsec, some regular expression engines and even handwritten C parsers
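As a small illustration of the zero-copy point, the following std-only sketch returns sub-slices of the input rather than copies. `take_bytes` is a hypothetical helper written for this example, not a nom function.

```rust
// Std-only sketch of zero-copy parsing; `take_bytes` is a hypothetical helper.
/// Splits off the first `n` bytes and returns (remaining input, taken bytes).
/// Both returned slices borrow from `input`; nothing is copied.
fn take_bytes(n: usize, input: &[u8]) -> Option<(&[u8], &[u8])> {
    if input.len() < n {
        return None;
    }
    let (head, rest) = input.split_at(n);
    Some((rest, head))
}
```

The returned `head` slice starts at the same address as the input buffer, which can be checked by comparing `head.as_ptr()` with the input's `as_ptr()`.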

Some benchmarks are available on GitHub.

Rust version requirements

The 6.0 series of nom requires Rustc version 1.44 or greater (compatible with 1.37 if building without the alloc or std features, i.e. --no-default-features --features="regexp,lexical").

Travis CI always has a build with a pinned version of Rustc matching the oldest supported Rust release. The current policy is that this will only be updated in the next major nom release.

Installation

nom is available on crates.io and can be included in your Cargo enabled project like this:

[dependencies]
nom = "6"

Then include it in your code like this:

#[macro_use]
extern crate nom;

NOTE: If you have existing code using nom below the 5.0 version, please take a look at the upgrade documentation to handle the breaking changes.

There are a few compilation features:

  • std: (activated by default) if disabled, nom can work in no_std builds
  • regexp: Enables regular expression parsers with the regex crate

You can activate those features like this:

[dependencies.nom]
version = "6"
features = ["regexp"]

Related projects

Parsers written with nom

Here is a (non exhaustive) list of known projects using nom:

Want to create a new parser using nom? A list of not yet implemented formats is available here.

Want to add your parser here? Create a pull request for it!

Contributors

nom is the fruit of the work of many contributors over the years, many thanks for your help!

Comments
  • Consolidate parser variants using ranges (e.g. `many0`, `many_m_n`)

    Prerequisites

    Here are a few things you should provide to help me understand the issue:

    • Rust version : 1.44
    • nom version : nom7
    • nom compilation features used: basic

    Idea

    Nom has many1, many0, and many_m_n functions. The same is true of other parts of the API.

    What if instead we had an IntoRange trait that took in the different range types and single numbers?

    Example:

    many(tag("abc"), 0..) -> many0(tag("abc")) 
    many(tag("abc"), 1..) -> many1(tag("abc")) 
    many(tag("abc"), 1) -> many_m_n(tag("abc"), 1, 1) 
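    A rough, std-only sketch of how such a trait could look is shown below. Everything in it is hypothetical: the `IntoRange` trait, its `bounds` method and this simplified `many` (which repeats a literal string instead of an arbitrary parser, and takes the input as a third argument) are assumptions for illustration and are not part of nom.

```rust
use std::ops::{RangeFrom, RangeInclusive};

// Hypothetical trait sketching the idea; nothing here is part of nom.
/// Converts a repetition spec into (min, optional inclusive max).
trait IntoRange {
    fn bounds(self) -> (usize, Option<usize>);
}

impl IntoRange for usize {
    // A single number means "exactly n times".
    fn bounds(self) -> (usize, Option<usize>) {
        (self, Some(self))
    }
}

impl IntoRange for RangeFrom<usize> {
    // `n..` means "at least n times", with no upper bound.
    fn bounds(self) -> (usize, Option<usize>) {
        (self.start, None)
    }
}

impl IntoRange for RangeInclusive<usize> {
    fn bounds(self) -> (usize, Option<usize>) {
        (*self.start(), Some(*self.end()))
    }
}

/// Matches `item` repeatedly at the start of `input`, within `range`.
/// Returns the remaining input and the repetition count, or None if
/// fewer than the minimum number of repetitions matched.
fn many<'a, R: IntoRange>(item: &str, range: R, input: &'a str) -> Option<(&'a str, usize)> {
    let (min, max) = range.bounds();
    let mut rest = input;
    let mut count = 0;
    while max.map_or(true, |m| count < m) {
        match rest.strip_prefix(item) {
            // The emptiness guard avoids looping forever on an empty item.
            Some(next) if !item.is_empty() => {
                rest = next;
                count += 1;
            }
            _ => break,
        }
    }
    if count >= min {
        Some((rest, count))
    } else {
        None
    }
}
```

    With this sketch, `0..` plays the role of many0, `1..` the role of many1, and a single number or `m..=n` the role of many_m_n.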
    
    opened by epage 39
  • [WIP] Feature multi range

    This PR intends to fix #1393.

    Description

    It replaces existing parser variations that only differ by how many times a subparser may run (like the different many_* variants) with a single parser that takes a range.

    In addition to a range parsers can also take a single usize value as parameter which evaluates to a range containing that value only (value..=value). This is primarily done for convenience.

    Migration remains fairly simple as the new parsers are a full replacement and just require the respective parameters to be rephrased as a single range.

    Reasoning

    Using ranges makes the API cleaner by removing unnecessary function duplication. It also allows nom to integrate with more of Rust's language features (specifically the range syntax).

    Example

    many(2..=4,
      tag("Abcd")
    )(i)
    

    With the old parsers this would be expressed using the many_m_n parser:

    many_m_n(2, 4,
      tag("Abcd")
    )(i)
    

    Special attention should be drawn to the fact that the other variations, such as many0 are also possible by using open ended ranges (0.., 1..).

    Notes

    The current implementation allows individual parsers to take ranges as parameters but requires them to document how they interpret open ended ranges. This was done to maximize flexibility and allow multiple, potentially contradicting, interpretations to coexist within the codebase.

    A good example of this in action is the difference between many and fold:

    • many (a.. -> a..=usize::MAX) many packages its results into a Vec. Therefore going beyond the usize limit makes no sense and any open ended range is capped at usize::MAX.
    • fold (a.. -> a..=∞) fold has static memory requirements since there is only one accumulator. This accumulator can be used indefinitely and as such the amount of iterations can go beyond even the usize::MAX limit (the current implementations of fold_* also support such cases). Because of this the parser interprets open ended ranges to mean "can run for an infinite number of times".

    TODOs

    Candidates

    The following parsers have been proposed for merging.

    • [x] many_* (many0, many1, many_m_n) Resolution: Done
    • [x] fold_* (fold_many0, fold_many1, fold_many_m_n) Resolution: Done
    • [ ] take_till* (take_till, take_till1)
    • [ ] take_while* (take_while, take_while1, take_while_m_n)
    • [ ] take_until* (take_until, take_until1)
    • [ ] many_till

    Open questions

    • [x] ~~How should the obsolete parsers be handled? Currently the obsolete parsers (like many_m_n) are deprecated using #[deprecated=""] with a message pointing the developer to the replacement (usually in the form of a simple "Replaced by <new_parser>"). This allows the changes to remain backwards compatible while steering developers towards using the replacement parsers. Tests for the deprecated parsers are left as is but are annotated with #[allow(deprecated)] to suppress the warnings. I would recommend keeping them in for as long as the parsers still exist to make sure that they aren't broken accidentally.~~ Resolution: No deprecations for this release. The old parsers and the new range based parsers will exist side-by-side.
    • [ ] Resolve open questions of the individual candidates. See here
    opened by cenodis 38
  • Use `memchr` in `FindToken` and `FindSubstring` implementation

    As said on Gitter, this is the first step for a memchr usage in the FindToken implementation. So far, no performance boost has been observed. Maybe benchmarks are not designed to show this kind of performance change yet.

    This is a WIP.

    opened by Hywan 26
  • IResult VS Result: making Incomplete part of errors?

    There have been a lot of demands that I change nom's basic type from IResult to std::result::Result. IResult has the following definition:

    pub enum IResult<I,O,E=u32> {
      Done(I,O),
      Error(Err<I,E>),
      Incomplete(Needed)
    }
    

    This was originally inspired by attoparsec's IResult, in which the Partial branch contained a closure to be called when more data is available. For various reasons, I was not able to make the closure idea work (note that Rust was very far from 1.0 at the time), so I chose to show how much data was needed and ask the user to parse again.

    I open this issue to study what a change from IResult to Result would entail. I make no promise to do that change, and I will not put the issue to a vote. I will however take into account the responses I get.

    The proposal is to make the Incomplete branch part of the Error branch, which would allow employing Result. I do not know yet what the end type would look like. I see two possibilities for the Err type: containing either a Needed or the other error branches, or flattening Needed at the same level.
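    The first possibility (an error type containing either a Needed or a regular error) could look roughly like the std-only sketch below. All names here (`ParseErr`, `ParseResult`, `take4`, `magic`) are hypothetical and chosen for illustration; this is not a statement of how nom defines its types.

```rust
// Hypothetical sketch of the proposal; these are not nom's definitions.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum Needed {
    Unknown,
    Size(usize),
}

/// Incomplete becomes one branch of the error type,
/// so the standard Result can carry all outcomes.
#[derive(Debug, PartialEq)]
enum ParseErr<E> {
    Incomplete(Needed),
    Error(E),
}

type ParseResult<'a, O, E> = Result<(&'a [u8], O), ParseErr<E>>;

/// Takes exactly four bytes, or reports how many are missing.
fn take4(input: &[u8]) -> ParseResult<'_, &[u8], &'static str> {
    if input.len() < 4 {
        return Err(ParseErr::Incomplete(Needed::Size(4 - input.len())));
    }
    Ok((&input[4..], &input[..4]))
}

/// Checks a four-byte magic value; `?` forwards both Incomplete and Error.
fn magic(input: &[u8]) -> ParseResult<'_, (), &'static str> {
    let (rest, head) = take4(input)?;
    if head == b"NOM!" {
        Ok((rest, ()))
    } else {
        Err(ParseErr::Error("bad magic"))
    }
}
```

    The point of the sketch is that a caller who does not care about streaming can treat Incomplete like any other error, while the `?` operator and the usual Result methods work unchanged.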

    So, to detail the arguments:

    pro:

    • can reuse all of the Result methods and the code relying on it
    • less surprising for developers
    • nom users can easily ignore all of the Incomplete usage, parsers are easier to write
    • it might make the macros simpler, because in pattern matching, most combinators would handle 2 cases instead of 4: Done and Error. In some combinators, there will be only three of them: Done, Error or Incomplete(Needed::Unknown), Incomplete(Needed::Size(sz)), because the calculation of needed data must still happen
    • a lot of nom parsers are stuck on an old version and likely will never update (because it does the job as is), so less code to rewrite, maybe?
    • a lot of people find Incomplete confusing, since they work on complete data (like a file completely read in memory)
    • I could add in there a "non backtrackable error" that can contain the same thing as a normal error, but would make everything return instead of testing other branches
    • it might make compilation faster (compilation time is the reason for the nom fork in syn)

    con:

    • it means a nearly complete rewrite of nom. It's not necessarily an issue, I'm willing to make the time for it if needed
    • there are still a lot of nom users relying on Incomplete, and this is a big breaking change for them
    • even parsers that do not use Incomplete would need to be updated to use Result
    • it might make some combinators confusing. All of the backtracking combinators like alt! currently return on Incomplete instead of testing the next branch. So, instead of alt! and alt_complete!, do a alt! and alt_incomplete! ?
    • nom was designed with streaming in mind, I am worried hiding Incomplete will hide this benefit of the library

    also, I am not sure about the timeline here. I am doing nom 2.0 very soon and it introduces some breaking changes, but I'm worried this change might be too big and drive users away. On the other hand, a 3.0 would likely happen far in the future, and there would be even more code (in nom and in code relying on nom) to update.

    so, what do people think?

    needs testing 
    opened by Geal 26
  • Enhancement to Docs: A "recipes" section

    Hi! Long time listener, first time caller. Love nom.

    Would you be interested in a PR for a "recipes" section of the docs that contains relatively short snippets to do common tasks? I'm thinking of stuff that's less than a complete example but more than a single combinator:

    • [X] EOL comments
    • [X] C-style comments, unnested and nested
    • [X] whitespace-eating combinators
    • [ ] ~Python-style~ ~indentation~ Already exists in the Python parser implementation.
    • [X] identifiers
    • [x] floats (an arbitrary but named syntax that a user could modify to their language)
    • [x] common integer forms (0x3f5a, 34u, 0b11010011, etc.)

    Also, are there more detailed docs or examples for verbose errors with row/col info? If not, and if you are agreeable to a PR along those lines, maybe you could point me to a nom parser that uses it, and I can extract bits to make a minimal but thorough tutorial.

    opened by rljacobson 18
  • nom 5.0 internal design

    After some thought, I reached a satisfying new design for nom 5.0, that I tried in the nomfun repository. This design uses functions instead of macros, with the same signature as macros combinators, mostly having I -> IResult<I,O,E> functions as arguments, and returning other functions, or applying them directly on some input. As an example, here is how the pair combinator would be written:

    pub fn pair<I, O1, O2, E, F, G>(first: F, second: G) -> impl Fn(I) -> IResult<I, (O1, O2), E>
      where F: Fn(I) -> IResult<I, O1, E>,
            G: Fn(I) -> IResult<I, O2, E> {
    
      move |input: I| {
        let (input, o1) = first(input)?;
        second(input).map(|(i, o2)| (i, (o1, o2)))
      }
    }
    
    pub fn pairc<I, O1, O2, E, F, G>(input: I, first: F, second: G) -> IResult<I, (O1, O2), E>
      where F: Fn(I) -> IResult<I, O1, E>,
            G: Fn(I) -> IResult<I, O2, E> {
    
      pair(first, second)(input)
    }
    

    This way we have two versions, one that combines two parsers and makes another one, and another that can take some input.

    The macro version can then be rewritten that way:

    macro_rules! pair(
      ($i:expr, $submac:ident!( $($args:tt)* ), $submac2:ident!( $($args2:tt)* )) => (
        pair!($i, |i| $submac!(i, $($args)*), |i| $submac2!(i, $($args2)*))
      );
    
      ($i:expr, $submac:ident!( $($args:tt)* ), $g:expr) => (
        pair!($i, |i| $submac!(i, $($args)*), $g);
      );
    
      ($i:expr, $f:expr, $submac:ident!( $($args:tt)* )) => (
        pair!($i, $f, |i| $submac!(i, $($args)*));
      );
    
      ($i:expr, $f:expr, $g:expr) => (
        $crate::pairc($i, $f, $g)
      );
    );
    

    As we can see currently in the 5.0 branch, most combinators are easy to replace:

    • extract the macro's code
    • put it in a function
    • add the proper trait bounds to the function
    • replace the macro's code with a function call

    The resulting code is functionally equivalent, has less type inference issues and is much faster when built with link time optimization (I have seen results like 20% faster with the new design on some benchmarks).

    Another benefit of this system is better import behaviour. Right now, even in edition 2018, macros that are exported are put at the top level (so the module import like macros use is actually a lie). So I cannot make variants of macros except by changing their name, to do stuff like separated_list and separated_list_complete. This was an issue because we expect slightly different behaviour in streaming parsers (where we're not sure we'll get the whole data at once) or in complete parsers (where we're sure we have the whole input). In nom 4, I tried to solve this by introducing the CompleteByteSlice and CompleteStr input types, that would behave differently, so you could use the same macros but have different parsers depending on the input. This proved difficult to use, especially considering that we might want to switch back and forth between behaviours (streaming parsers using TLV will know that they have the complete data inside the TLV, complete parsers might want to use methods that work directly on &[u8] and &str). Also, most people did not bother reading the documentation about it and started directly using &[u8] and &str as input when they expected the other behaviour, which resulted in quite some time spent explaining it.

    So with functions, we can actually make specialized versions of combinators. We could imagine having streaming and complete versions of many0, tag, etc. And we would let people use those versions by importing them directly (use nom::multi::streaming::many0, etc), and they could even use both versions in the same file.

    The downside is that there's an enormous amount of work for this:

    • for most combinators (that are not affected by streaming or not), two functions to add:
      • a function returning a closure
      • a function that is called directly
    • for combinators affected by streaming, multiply that by 3:
      • a legacy function to keep backward compatibility inside the macro (although I'm not yet sure we should keep the old behaviour, I might just remove CompleteByteSlice and CompleteStr)
      • a function that works in streaming
      • a function that does not work in streaming

    These functions will also require their own documentation and tests, and all of nom's documentation and examples should probably be adapted to this. I'm making steady progress on converting the combinators, but there's still a lot to do. (TODO: make a checklist of which combinators were ported over or not)

    Questions I have to solve now:

    • do I keep nom 4's behaviour with CompleteByteSlice and CompleteStr ?
    • do I keep macros related to streaming backward compatible with nom 4?
    • if not, which behaviour should I go with? Streaming or not?
    • do I make two versions of each combinators, or can I assume people will be able to write pair(first, second)(input) without any issues?
    • how do I port combinators that have a variable number of arguments, like do_parse (this one could probably be written directly with the ? syntax like this: https://github.com/Geal/nomfun/blob/master/benches/http.rs#L93-L102 ), tuple, permutation, alt, switch?
    • how do I reimplement ws?
    • should I extract method and ws in their own crates? They're not strictly necessary to nom and would make sense as separate libraries
    opened by Geal 17
  • Building nom for no_std fails on dependency memchr

    Hello there, I'm trying to build nom for a RISC-V target which will use no_std. But after adding it to my Cargo.toml rust complained that a dependency required std.

    However looking at the previous issues, I found https://github.com/Geal/nom/issues/1370 and saw that there was a fix so I switched my dependency to point directly at github:

    [dependencies.nom]
    # version = "7.0.0"
    git = "https://github.com/Geal/nom"
    default-features = false
    features = []
    

    However this is still producing:

    error[E0463]: can't find crate for `std`
      |
      = note: the `riscv32imac-unknown-none-elf` target may not support the standard library
      = note: `std` is required by `memchr` because it does not declare `#![no_std]`
      = help: consider building the standard library from source with `cargo build -Zbuild-std`
    
    error: aborting due to previous error
    
    For more information about this error, try `rustc --explain E0463`.
    error: could not compile `memchr`
    

    Looking at the nom Cargo.toml features, it seems like turning off all default features (like I'm doing above) should turn off std in all deps. And it did help, as I fixed the error I was getting in the issue above but I now get an error for memchr.

    Any ideas why setting default-features = false wouldn't do the trick for memchr?

    opened by dougli1sqrd 16
  • Make character::*::alpha1(&str) accept all alphabet characters from UTF-8

    alpha1 should accept any kinds of alphabetical characters, and because &strs natively support UTF-8, it is counterintuitive and just less useful that alpha1 only takes characters a-zA-Z (correct me if I'm mistaken).
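    The difference being discussed can be shown with the standard library alone. In this sketch, `take_while1`, `alpha1_ascii` and `alpha1_unicode` are hypothetical helpers written for illustration, not nom functions; only `char::is_ascii_alphabetic` and `char::is_alphabetic` come from std.

```rust
// Std-only sketch; these helpers are hypothetical and not part of nom.
/// Takes one or more leading characters matching `pred`.
/// Returns (remaining input, matched prefix), or None if nothing matched.
fn take_while1(input: &str, pred: impl Fn(char) -> bool) -> Option<(&str, &str)> {
    let end = input.find(|c: char| !pred(c)).unwrap_or(input.len());
    if end == 0 {
        None
    } else {
        Some((&input[end..], &input[..end]))
    }
}

/// Accepts only a-z and A-Z, the behaviour described in the issue.
fn alpha1_ascii(input: &str) -> Option<(&str, &str)> {
    take_while1(input, |c| c.is_ascii_alphabetic())
}

/// Accepts any character with the Unicode Alphabetic property.
fn alpha1_unicode(input: &str) -> Option<(&str, &str)> {
    take_while1(input, char::is_alphabetic)
}
```

    On the input "αβγ 123", the ASCII variant matches nothing, while the Unicode variant consumes "αβγ" and leaves " 123".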

    opened by mankinskin 16
  • nom consuming 100% cpu

    I am exploring the possibility of switching to nom in a project I am working on. I am not fully familiar with nom yet, so please bear with me.

    For starters, I was trying to come up with a parser that matches strings of the form [a-zA-Z][-a-zA-Z0-9_]*. I wrote this:

    #[macro_use]
    extern crate nom;
    
    use std::str::from_utf8;
    
    use nom::{alpha, alphanumeric};
    use nom::{IResult, Needed};
    use nom::IResult::*;
    
    named!(identifier<&[u8], String>,
           chain!(
               h: map_res!(alpha, from_utf8) ~
               t: many0!(alt!(alphanumeric | tag!("-") | tag!("_"))),
               || {
                   let  s = h.to_string();
                   t.into_iter().fold(s, |mut accum, slice| {
                       accum.push_str(from_utf8(slice).unwrap()); accum })}));
    

    And I tested it with:

        #[test]
        fn id_name() {
            let a_setting = &b"miles"[..];
            let res = identifier(a_setting);
            assert_eq!(res, Done(&b""[..], "miles".to_string()));
        }
    

    When I run cargo test my PC completely hangs. With top I can see that it starts consuming more and more CPU and memory until the entire system is completely unusable and I have to hard reset.

    Am I doing something wrong? Is this the best way to make a parser to match this type of strings?

    opened by filipegoncalves 15
  • error type names are confusing and hard to search and missing way to make them owned

    • Err enum is a terrible name, very confusing
    • Err is not in the error module
    • ErrorConvert is not in the error module
    • IResult is not in the error module
    • Needed is not in the error module
    • both to_owned() methods now make no sense in v6 https://docs.rs/nom/6.2.1/nom/enum.Err.html#method.to_owned
    • there is no to_owned() way to convert an Err<Error<&str>> to an Err<Error<String>>
    • convert() from Error seems useless for users https://docs.rs/nom/6.2.1/nom/trait.ErrorConvert.html#tymethod.convert
    • most util methods don't work with the new v6 default error type.

    And those are just the few problems I hit while doing a little parsing of HTTP bearer authorization today. Keep it simple please; I have already raised this problem several times in the past, and it's worse in v6! Or maybe I'm missing something, in which case please enlighten me: https://github.com/Geal/nom/blob/master/doc/error_management.md doesn't seem up to date.

    We really need a clear example of how to make error types owned, and clear guidelines on how nom parsers should handle errors properly. Maybe there are some ideas here: https://github.com/rust-lang/project-error-handling

    opened by Stargateur 14
  • many_m_n can succeed when min > max

    • nom version: 6.2.1
    • nom compilation features used: none

    The following code:

    fn main() {
      let res: nom::IResult<&str, Vec<char>> =
        nom::multi::many_m_n(4, 2, nom::character::complete::char('a'))("aaa");
      dbg!(res);
    }
    

    will succeed, consuming two 'a':

    [src/main.rs:4] res = Ok(
        (
            "a",
            [
                'a',
                'a',
            ],
        ),
    )
    

    While it is unlikely that someone would hardcode a minimum value greater than a maximum value, they can be the result of more complex code, and a call to many_m_n in such conditions should probably systematically fail (as the constraint of parsing something at least min but at most max times would be impossible to satisfy).
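    The suggested behaviour can be sketched without nom as follows. `many_m_n_chars` is a hypothetical, std-only helper that repeats a single character; it only illustrates the early `min > max` check and is not nom's many_m_n.

```rust
// Std-only sketch of the suggested check; not nom's `many_m_n`.
/// Matches `wanted` between `min` and `max` times at the start of `input`.
/// Returns (remaining input, matched chars), or None on failure.
fn many_m_n_chars(min: usize, max: usize, wanted: char, input: &str) -> Option<(&str, Vec<char>)> {
    // An impossible constraint fails up front instead of silently succeeding.
    if min > max {
        return None;
    }
    let mut out = Vec::new();
    let mut rest = input;
    while out.len() < max {
        match rest.strip_prefix(wanted) {
            Some(next) => {
                out.push(wanted);
                rest = next;
            }
            None => break,
        }
    }
    if out.len() >= min {
        Some((rest, out))
    } else {
        None
    }
}
```

    With this check in place, the call from the report (min 4, max 2 on "aaa") fails, while a satisfiable range such as min 1, max 2 still consumes two 'a' and leaves "a".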

    Have a nice day

    needs testing 
    opened by carado 14
  • move usage towards AsChar method instead of functions under the character module

    this reduces duplication a bit

    the implementation comes from:

    • https://github.com/epage/nom-experimental/pull/2
    • https://github.com/epage/nom-experimental/pull/25
    opened by Geal 1
  • New Combinator: discard_until / drop_until

    This is a combinator I use all the time, might be useful to see something like it in this crate.

    It drops a byte at a time until the given parser matches, then returns the result.

    I don't do parsing in any really performance-sensitive contexts, so this can probably be better implemented. This impl demonstrates the idea.

    fn drop_until<'a, T>(
        parser: fn(&'a str) -> IResult<&'a str, T>,
    ) -> impl FnMut(&'a str) -> IResult<&'a str, T> {
        map(many_till(take(1u8), parser), |(_, matched)| matched)
    }
    
    opened by Trequetrum 2
  • Capturing of expected values is inflexible

    If you use char parser, nom automatically reports the expected char but many other parsers (e.g. one_of) don't. This leads to the use of alt((char(.), char(.))) in places where a one_of would make more sense / be cleaner.

    Ideally, the error reporting would be made more flexible, which will be made more complicated by #1580.

    A (very) rough proposal

    • Split out VerboseErrorKind::Char to an enum Value with
      • Token(T)
      • Set(<I as IntoOutput>::Output)
      • Range(T..=T)
      • Next(usize) (how much of the Input is to be included)
      • Descriptor(&'static str) (for more descriptive names)
    • Rename FindTokens to TokenPattern to be more explicit in how its used
    • Add a TokenPattern::values iterator that returns all of the error reporting Values contained in it
    • Add TokenPattern::named that creates a NamedPattern that exclusively returns Descriptor
    • Extend the Error trait for adding expected and unexpected "Value"s
    • Add Parser::expected and Parser::unexpected combinators to allow a user to set this on any combinator explicitly

    Some inspiration is coming from combine

    • https://docs.rs/combine/latest/combine/trait.Parser.html#method.expected
    • https://github.com/Marwes/combine/issues/333
    opened by epage 2
  • Documentation example giving error

    I'm doing the example from the docs here: https://docs.rs/nom/latest/nom/combinator/fn.map_res.html

    let mut parse = map_res(digit1, |s: &str| s.parse::<u8>());
    

    That gives:

    error[E0283]: type annotations needed
       --> src/main.rs:31:24
        |
    31  |     let base = map_res(digit1, |s: &str| s.parse::<u32>());
        |                ------- ^^^^^^ cannot infer type of the type parameter `E` declared on the function `digit1`
        |                |
        |                type must be known at this point
        |
        = note: cannot satisfy `_: FromExternalError<&str, ParseIntError>`
    note: required by a bound in `map_res`
    

    No idea why the example from the documentation would be failing here.

    opened by alper 1
  • `Err` value: `Error(())`

    I keep getting this error: `Err` value: `Error(())`. Perhaps I am doing something wrong. Please advise. Thanks in advance.

    parse_string taken from the examples

    fn main() {
      let dd = "\u{1e}";
      let result = parse_string::<()>(dd);
      println!("Result:\n\n{}", result.unwrap().1);
    }
    
    opened by uwejan 1
Owner
Geoffroy Couprie
A fast monadic-style parser combinator designed to work on stable Rust.

Chomp Chomp is a fast monadic-style parser combinator library designed to work on stable Rust. It was written as the culmination of the experiments de

Martin Wernstål 228 Oct 31, 2022
A parser combinator library for Rust

combine An implementation of parser combinators for Rust, inspired by the Haskell library Parsec. As in Parsec the parsers are LL(1) by default but th

Markus Westerlind 1.1k Dec 28, 2022
A friendly parser combinator crate

Chumsky A friendly parser combinator crate that makes writing LL-1 parsers with error recovery easy. Example Here follows a Brainfuck parser. See exam

Joshua Barretto 2.4k Jan 8, 2023
A parser combinator for parsing &[Token].

PickTok A parser combinator like nom but specialized in parsing &[Token]. It has similar combinators as nom, but also provides convenient parser gener

Mikuto Matsuo 6 Feb 24, 2023
Website for Microformats Rust parser (using 'microformats-parser'/'mf2')

Website for Microformats Rust parser (using 'microformats-parser'/'mf2')

Microformats 5 Jul 19, 2022
A native Rust port of Google's robots.txt parser and matcher C++ library.

robotstxt A native Rust port of Google's robots.txt parser and matcher C++ library. Native Rust port, no third-part crate dependency Zero unsafe code

Folyd 72 Dec 11, 2022
Parsing Expression Grammar (PEG) parser generator for Rust

Parsing Expression Grammars in Rust Documentation | Release Notes rust-peg is a simple yet flexible parser generator that makes it easy to write robus

Kevin Mehall 1.2k Dec 30, 2022
LR(1) parser generator for Rust

LALRPOP LALRPOP is a Rust parser generator framework with usability as its primary goal. You should be able to write compact, DRY, readable grammars.

null 2.4k Jan 7, 2023
A typed parser generator embedded in Rust code for Parsing Expression Grammars

Oak Compiled on the nightly channel of Rust. Use rustup for managing compiler channels. You can download and set up the exact same version of the comp

Pierre Talbot 138 Nov 25, 2022
Rust query string parser with nesting support

What is Queryst? This is a fork of the original, with serde and serde_json updated to 0.9 A query string parsing library for Rust inspired by https://

Stanislav Panferov 67 Nov 16, 2022
Soon to be AsciiDoc parser implemented in rust!

pagliascii "But ASCII Doc, I am Pagliascii" Soon to be AsciiDoc parser implemented in rust! This project is the current implementation of the requeste

Lukas Wirth 49 Dec 11, 2022
PEG parser for YAML written in Rust 🦀

yaml-peg PEG parser (pest) for YAML written in Rust 🦀 Quick Start ⚡️ # Run cargo run -- --file example_files/test.yaml # Output { "xmas": "true",

Visarut Phusua 4 Sep 17, 2022
This project aims to implement a CSS(less like) parser in rust. Currently the code is targeting the PostCSS AST.

CSS(less like) parser written in rust (WIP) This project aims to implement a CSS(less like) parser in rust. Currently the code is targeting the PostCS

Huang Liuhaoran 21 Aug 23, 2022
MRT/BGP data parser written in Rust.

BGPKIT Parser BGPKIT Parser aims to provides the most ergonomic MRT/BGP message parsing Rust API. BGPKIT Parser has the following features: performant

BGPKIT 46 Dec 19, 2022
This project aims to implement a CSS(less like) parser in rust. Currently the code is targeting the PostCSS AST. Very early stage, do not use in production.

CSS(less like) parser written in rust (WIP) This project aims to implement a CSS(less like) parser in rust. Currently the code is targeting the PostCS

Huang Liuhaoran 21 Aug 23, 2022
A feature-few, no-allocation JSON parser in `no_std` rust.

Small JSON Parser in no_std This library reads and parses JSON strings. Its intended use case is to read a JSON payload once. It does not serialise da

Robert Spencer 18 Nov 29, 2022
A Gura parser for Rust

Gura Rust parser IMPORTANT: if you need to use Gura in a more user-friendly way, you have at your disposal Serde Gura which allows you to perform Seri

Gura Config Lang 21 Nov 13, 2022
Front Matter parser for Rust.

fronma Front Matter parser for Rust. Usage Add this crate as a dependency: [dependencies] fronma = "~0.1" then use fronma::parser::parse to parse text

Ryo Nakamura 6 Nov 19, 2021
A Rust crate for LL(k) parser combinators.

oni-comb-rs (鬼昆布,おにこんぶ) A Rust crate for LL(k) parser combinators. Main project oni-comb-parser-rs Sub projects The following is projects implemented

Junichi Kato 24 Nov 3, 2022