The WAT (WebAssembly Text Format) parser in wasm-language-tools v0.5 and earlier was not fast enough. I recently rewrote the parser from scratch, and its performance increased by 350% in the benchmark. Let me share how I optimized it.

Use a hand-written parser

The old parser was written with winnow, which is a parser combinator library. While it's easy to create a parser with parser combinators, the result is generally slower than a hand-written parser, so the first step was to write the parser by hand. A hand-written parser is not only faster but also leaves room for more optimizations in the future.

Clone well-known green tokens and green nodes

There are many parentheses and keywords in WAT. These tokens and nodes shouldn't be created again and again while parsing. Looking into the implementation of rowan::GreenToken and rowan::GreenNode, there's an Arc inside, so we can prepare these well-known tokens and nodes in advance, put them into LazyLock one by one, and clone them when needed.

Keyword matching

There are many keywords in WAT, such as module, func, param, result, etc. When recognizing keywords in the lexer, we don't capture a word and then check it by comparing strings. Instead, we check the prefix of the source code in bytes:

    self.input.as_bytes().starts_with(keyword.as_bytes())

However, there may be a word like function that starts with func but isn't a keyword, so we must also check that the next character is not an identifier character.

Use get_unchecked to create tokens

Apart from strings and comments, all other kinds of tokens are plain ASCII strings. For these tokens, we can use get_unchecked to skip the UTF-8 boundary check that get would perform.

Use our own Token type

The lexer produces tokens in our own Token type instead of rowan::GreenToken, because creating a rowan::GreenToken is much more expensive, and we should create one only when needed. The Token type is as simple as this:

    struct Token<'s> {
        kind: SyntaxKind,
        text: &'s str,
    }

For convenience, I added impl From<Token<'_>> for
...
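
The keyword matching, get_unchecked, and Token ideas described above can be sketched together as follows. This is only a minimal illustration, not the actual wasm-language-tools code: the Lexer struct, its keyword method, the single-variant SyntaxKind, and the is_id_char helper are all hypothetical names, and the set of identifier characters is my reading of the WebAssembly text format.

```rust
/// Token kinds for this sketch (hypothetical; the real enum is much larger).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SyntaxKind {
    Keyword,
}

/// Lightweight token that only borrows from the source text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Token<'s> {
    kind: SyntaxKind,
    text: &'s str,
}

/// Assumed definition of an identifier character in WAT.
fn is_id_char(b: u8) -> bool {
    b.is_ascii_alphanumeric() || b"!#$%&'*+-./:<=>?@\\^_`|~".contains(&b)
}

/// Hypothetical lexer that keeps only the not-yet-consumed input.
struct Lexer<'s> {
    input: &'s str,
}

impl<'s> Lexer<'s> {
    fn new(input: &'s str) -> Self {
        Self { input }
    }

    /// Tries to consume `keyword` at the start of the remaining input.
    fn keyword(&mut self, keyword: &'static str) -> Option<Token<'s>> {
        let bytes = self.input.as_bytes();
        // Compare the prefix as bytes instead of capturing a word first.
        if !bytes.starts_with(keyword.as_bytes()) {
            return None;
        }
        // Reject words such as `function` when looking for `func`.
        if bytes.get(keyword.len()).is_some_and(|&b| is_id_char(b)) {
            return None;
        }
        // SAFETY: the input starts with the bytes of `keyword`, which is a
        // valid `&str`, so `keyword.len()` is in bounds and lies on a
        // character boundary of the input.
        let (text, rest) = unsafe {
            (
                self.input.get_unchecked(..keyword.len()),
                self.input.get_unchecked(keyword.len()..),
            )
        };
        self.input = rest;
        Some(Token {
            kind: SyntaxKind::Keyword,
            text,
        })
    }
}
```

With this sketch, calling keyword("func") on a lexer over "func $f" yields a token whose text is "func" and leaves " $f" as the remaining input, while the same call on "function" returns None and consumes nothing.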