nom-rule
A procedural macro for defining nom combinators in simple DSL. Requires nom
v5.0+.
Dependencies
[dependencies]
nom = "7"
nom-rule = "0.2"
Syntax
The procedural macro rule!
provided by this crate is designed for the ease of writing grammar spec as well as to improve maintainability, it follows these simple rules:
TOKEN
: match the token by token kind. You should provide a parser to eat the next token if the token kind matched. it will get expanded intomatch_token(TOKEN)
.";"
: match the token by token text. You should provide a parser to eat the next token if the token text matched. it will get expanded intomatch_text(";")
in this example.#fn_name
: an external nom parser function. In the example above,ident
is a predefined parser for identifiers.a ~ b ~ c
: a sequence of parsers to take one by one. It'll get expanded intonom::sequence::tuple
.(...)+
: one or more repeated patterns. It'll get expanded intonom::multi::many1
.(...)*
: zero or more repeated patterns. It'll get expanded intonom::multi::many0
.(...)?
: Optional parser. It'll get expanded intonom::combinator::opt
.a | b | c
: Choices between a, b, and c. It'll get expanded intonom::branch::alt
.&a
: Positive predicate. It'll get expanded intonom::combinator::map(nom::combinator::peek(a), |_| ())
. Note that it doesn't consume the input.!a
: Negative predicate. It'll get expanded intonom::combinator::not
. Note that it doesn't consume the input.... : "description"
: Context description for error reporting. It'll get expanded intonom::error::context
.
Example
Define match_text
parser and match_token
parser for your custom token type. You can use nom::combinator::fail
as match_token
if your parser use &str
or &[u8]
as input because you won't match on token kinds.
#[derive(Clone, Debug, PartialEq)]
struct Token<'a> {
kind: TokenKind,
text: &'a str,
span: Span,
}
#[derive(Clone, Copy, Debug, PartialEq)]
enum TokenKind {
Whitespace,
// Keywords
CREATE,
TABLE,
// Symbols
LParen,
RParen,
Semicolon,
Comma,
Ident,
}
fn match_text<'a, Error: ParseError<'a>>>(
text: &'a str,
) -> impl FnMut(Input<'a>) -> IResult<'a>, &'a Token<'a>, Error> {
move |i| satisfy(|token: &Token<'a>| token.text == text)(i)
}
fn match_token<'a, Error: ParseError<'a>>>(
kind: TokenKind,
) -> impl FnMut(Input<'a>) -> IResult<'a>, &'a Token<'a>, Error> {
move |i| satisfy(|token: &Token<'a>| token.kind == kind)(i)
}
Then give the two parser to nom_rule::rule!
by wrapping it into a custom macro:
macro_rules! rule {
($($tt:tt)*) => {
nom_rule::rule!($crate::match_text, $crate::match_token, $($tt)*)
}
}
To define a parser for the SQL of creating table:
let mut rule = rule!(
CREATE ~ TABLE ~ #ident ~ "(" ~ (#ident ~ #ident ~ ","?)* ~ ")" ~ ";" : "CREATE TABLE statement"
);
It will get expanded into:
let mut rule =
nom::error::context(
"CREATE TABLE statement",
nom::sequence::tuple((
(crate::match_token)(CREATE),
(crate::match_token)(TABLE),
ident,
(crate::match_text)("("),
nom::multi::many0(nom::sequence::tuple((
ident,
ident,
nom::combinator::opt((crate::match_text)(",")),
))),
(crate::match_text)(")"),
(crate::match_text)(";"),
))
);
Auto Sequence (nightly only)
nom-rule
is able to automatically insert ~
in the rule when necessary so that you get the example above working the same as the following:
let mut rule = rule!(
CREATE TABLE #ident "(" (#ident #ident ","?)* ")" ";" : "CREATE TABLE statement"
);
To enable this feature, you need to use a nightly channel rust complier, and add this to the Cargo.toml
:
nom-rule = { version = "0.2", features = ["auto-sequence"] }