Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider inlining accepting states without transitions #7

Closed
osa1 opened this issue Jun 1, 2021 · 1 comment
Closed

Consider inlining accepting states without transitions #7

osa1 opened this issue Jun 1, 2021 · 1 comment
Labels
perf Related to runtime performance

Comments

@osa1
Copy link
Owner

osa1 commented Jun 1, 2021

I will come up with a minimal repro, but here's an example for now:

match self.state {
    ...
    56usize => match self.iter.peek().copied() {
        ...
        Some((char_idx, char)) => match char {
            '\u{5c}' => {
                self.current_match_end += char.len_utf8();
                let _ = self.iter.next();
                self.state = 58usize;
            }
            ...
        },
    },
    ...
    58usize => {
        let rhs: fn(LexerHandle<'_, 'input>) -> LexerAction<Token> =
            |lexer| lexer.switch(LexerRule::StringEscape);
        let str = &self.input[self.current_match_start..self.current_match_end];
        let handle = LexerHandle {
            iter: &mut self.iter,
            match_: str,
            user_state: &mut self.user_state,
        };
        match rhs(handle) {
            LexerAction::Continue => {
                self.state = self.initial_state;
                continue;
            }
            LexerAction::Return(tok) => {
                self.state = self.initial_state;
                return Some(Ok((
                    self.current_match_start,
                    tok,
                    self.current_match_end,
                )));
            }
            LexerAction::Switch(rule_set) => {
                self.switch(rule_set);
                continue;
            }
            LexerAction::SwitchAndReturn(tok, rule_set) => {
                self.switch(rule_set);
                return Some(Ok((
                    self.current_match_start,
                    tok,
                    self.current_match_end,
                )));
            }
        }
    }
    ...
}

State 56 has a transition to 58. State 58 is an accepting state and it doesn't have any transitions.

In general, when a state is an accepting state and doesn't have any transitions we can "inline" it and avoid (1) extra states (2) redundant looping (no need to update the state, continue).

In the example state 58 is only referred by 56. In general I'm guessing this doesn't have to be the case, but I suspect it may still be OK to inline accepting states.

@osa1 osa1 added the perf Related to runtime performance label Jun 1, 2021
@osa1
Copy link
Owner Author

osa1 commented Jun 3, 2021

This is implemented in dfa_opt. The code is currently terrible, I'll refactor it and submit a PR. cargo bench reports:

Lex Lua files           time:   [2.3897 ms 2.3930 ms 2.3970 ms]
                        change: [-15.060% -14.908% -14.753%] (p = 0.00 < 0.05)
                        Performance has improved.

Code size grows 1.4% (10176 lines to 10328). Number of states down 20% (150 to 119).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
perf Related to runtime performance
Projects
None yet
Development

No branches or pull requests

1 participant