Rewrite the low-level SMT parser #35

AdrienChampion · 2022-08-15T14:47:58Z

SmtParser uses a handmade parser that is showing its limits more and more (#33)

Would love to have something cleaner, though it would need to be as efficient as the current parser. Needs investigation (peg, nom, ?) and some benchmarking.

The text was updated successfully, but these errors were encountered:

filippodebortoli · 2022-12-01T18:47:57Z

A PEG for SMT-LIB 2.6 (syntax up to Section 3.5 of the standard document) in pest could look like this:

WHITESPACE = _{ "\t" | "\n" | "\r" | " " }
COMMENT = _{ ";" ~ ANY* ~ NEWLINE }

printable_char = @{ '\x20'..'\x7e' | '\x80'..'\x8f' }
numeral = @{ "0" | ('1'..'9') ~ ASCII_DIGIT* }
decimal = @{ numeral ~ "." ~ "0"* ~ numeral }
hexadecimal = @{ "#x" ~ ASCII_HEX_DIGIT+ }
binary = @{ "#b" ~ ('0'..'1')+}
escaped_double_quote = _{ "\"\"" }
string = @{ "\"" ~ (printable_char | WHITESPACE | escaped_double_quote)* "\"" }
reserved_word = @{ "BINARY" | "DECIMAL" | "HEXADECIMAL" | "NUMERAL" | "STRING" | "_" | "!" | "as" | "let" | "exists" | "forall" | "match" | "par" }
char_list = { "~" | "!" | "@" | "$" | "%" | "^" | "&" | "*" | "_" | "-" | "+" | "=" | "<" | ">" | "." | "?" | "/"}
sym_simple = @{ (ASCII_ALPHA | char_list) ~ (ASCII_ALPHANUMERIC | char_list)* }
sym_quoted = @{ "|" ~ (!("\\" | "|") ~ (WHITESPACE | printable_char))* ~ "|"}
symbol = { sym_simple | sym_quoted }
keyword = {":" ~ sym_simple }
spec_constant = _{ numeral | decimal | hexadecimal | binary | string }
s_expr = @{ spec_constant | symbol | reserved_word | keyword | "(" ~ s_expr* ~ ")" }
index = { numeral | symbol }
identifier = { symbol | "(_ " ~ symbol ~ " " ~ index+ ")" }
attribute_value = @{ spec_constant | symbol | "(" ~ s_expr* ~ ")" }
attribute = @{ keyword | keyword ~ " " ~ attribute_value}
sort = @{ identifier | "(" ~ identifier ~ " " ~ sort+ ")" }

The next steps would be to encode syntax for: Terms and formulas, theory declarations, logic declarations, scripts.
While writing the grammar file could be somewhat easy, I'd expect the task of rewriting the parsing functions to be cumbersome.

AdrienChampion · 2022-12-06T10:35:18Z

@filippodebortoli are you confident that peg would be as efficient as a handmade parser?

Generally one ignores such concerns because printing/parsing time should be (very) small compared to solving time. The issue is that efficient verification tools go to great length to issue checks that are as small as possible, in some cases extremely small.

Usually, this means that the tool issues a lot of small checks as opposed to one/a few big one(s). In this context I think parsing time can be a concern, but that's just my opinion 🐱

filippodebortoli · 2022-12-06T14:22:06Z

@filippodebortoli are you confident that peg would be as efficient as a handmade parser?

I am not confident enough to claim that it would yield a parser that competes with or outperforms the current, handmade one. I have the feeling that it would simplify a lot the parser's implementation and make it easier to maintain.
There is one clear drawback: the parser provided by pest only accepts str as input, which is far more restrictive than what SmtParser currently allows for.

AdrienChampion · 2022-12-10T09:16:02Z

I wrote this a looonnng time ago, I need to take a look but right now I lean more towards keeping the handwritten parser, especially since it's quite small. I don't remember what the parser's code looks like but I expect that its maintainability can be improved greatly :D

AdrienChampion added the help wanted label Aug 15, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Rewrite the low-level SMT parser #35

Rewrite the low-level SMT parser #35

AdrienChampion commented Aug 15, 2022

filippodebortoli commented Dec 1, 2022

AdrienChampion commented Dec 6, 2022

filippodebortoli commented Dec 6, 2022

AdrienChampion commented Dec 10, 2022

Rewrite the low-level SMT parser #35

Rewrite the low-level SMT parser #35

Comments

AdrienChampion commented Aug 15, 2022

filippodebortoli commented Dec 1, 2022

AdrienChampion commented Dec 6, 2022

filippodebortoli commented Dec 6, 2022

AdrienChampion commented Dec 10, 2022