EP grammar changes. #223

Merged: 1 commit, Feb 28, 2017

@veelo (Collaborator) commented Feb 28, 2017

Grammar changes that lead to the successful parse of a 200 KiB real-life Prospero Extended Pascal file.

You were asking about performance: the parse takes 13 minutes. The generated HTML file is close to 24 MiB. I have never traced all of it, because my trace files passed the 1 GiB mark and I am short on disk space.

So, it's not fast. To be fair, this grammar is bloated and could probably be optimised heavily for speed. But I like to stay close to the text of the standard, and it is not too slow for my application.

I have been delayed with the documentation that I promised, because it occurred to me that trying the rules in longest_match could possibly be parallelised, and how cool would that be? I almost have it working, but at least on OS X I get a bus error, possibly because of the recursive calls to longest_match (I counted a maximum of 52 levels in my test) combined with the small default stack space for threads on OS X. When I parallelised only the first call to longest_match it did work, and it was 3 minutes quicker, but it doesn't feel right to make a PR for that. Also, I didn't take any data-sharing measures, so each thread is probably using its own memoisation table, which probably wastes a lot of work. So a real solution would involve a lot more, I fear, and it is not going to be really fast anyway. But in case we need it, it is in my branch https://github.com/veelo/Pegged/commits/parallel.
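
To make the memoisation concern concrete, here is a minimal Python sketch, not Pegged's actual code. `MemoTable`, `expensive_rule` and `total_misses` are hypothetical names invented for this illustration. It counts how often a rule body really runs when several threads ask for the same (rule, position) pair, once with a table per thread and once with a single table guarded by a lock. (In D, module-level variables are thread-local unless marked `shared` or `__gshared`, which would be consistent with the suspicion above, but whether that applies to Pegged's table is an assumption.)

```python
import threading


class MemoTable:
    """Packrat-style cache keyed on (rule name, input position)."""

    def __init__(self):
        self._results = {}
        self._lock = threading.RLock()  # re-entrant, since rules may call other rules
        self.misses = 0                 # how often a rule body actually ran

    def get_or_compute(self, rule, pos, compute):
        with self._lock:
            key = (rule, pos)
            if key not in self._results:
                self.misses += 1
                self._results[key] = compute()
            return self._results[key]


def expensive_rule(text, pos):
    """Hypothetical stand-in for a costly grammar rule; returns the end position."""
    return len(text)


def total_misses(make_table, n_threads=4, text="program"):
    """Apply the same rule at the same position in n_threads threads and
    return how many times the rule body ran in total."""
    tables = []
    tables_lock = threading.Lock()

    def worker():
        table = make_table()
        with tables_lock:
            if all(t is not table for t in tables):
                tables.append(table)
        table.get_or_compute("expensive_rule", 0, lambda: expensive_rule(text, 0))

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(t.misses for t in tables)


shared = MemoTable()
print(total_misses(lambda: shared))  # 1: one table, the rule ran once
print(total_misses(MemoTable))       # 4: one table per thread, the rule ran four times
```

The lock here is held while the rule is computed, which keeps the count exact but also serialises the threads; a real design would have to trade duplicated work against lock contention.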

…e Prospero Extended Pascal file.

 1. Make all spacing and comments optional.
 2. Use longest match where appropriate.
 3. Change order.
 4. Add Prospero extensions.
 5. Keyword literals.
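
Items 2 and 3 above both stem from how alternatives are chosen. The following is a small Python sketch of the difference between PEG ordered choice and longest match, under the assumption that a parser is simply a function from (text, position) to an end position or `None`. The names `literal`, `ordered_choice` and `longest_match` are illustrative and do not reproduce Pegged's implementation.

```python
from typing import Callable, Optional

# A parser takes (text, pos) and returns the end position on success, or None on failure.
Parser = Callable[[str, int], Optional[int]]


def literal(s: str) -> Parser:
    def parse(text: str, pos: int) -> Optional[int]:
        return pos + len(s) if text.startswith(s, pos) else None
    return parse


def ordered_choice(*alts: Parser) -> Parser:
    """Ordered choice: return the result of the first alternative that succeeds."""
    def parse(text: str, pos: int) -> Optional[int]:
        for alt in alts:
            end = alt(text, pos)
            if end is not None:
                return end
        return None
    return parse


def longest_match(*alts: Parser) -> Parser:
    """Try every alternative and keep the one that consumes the most input."""
    def parse(text: str, pos: int) -> Optional[int]:
        ends = [end for alt in alts if (end := alt(text, pos)) is not None]
        return max(ends, default=None)
    return parse


first = ordered_choice(literal("in"), literal("integer"))
longest = longest_match(literal("in"), literal("integer"))
print(first("integer", 0))    # 2: "in" wins because it is listed first
print(longest("integer", 0))  # 7: "integer" wins because it is longer
```

With ordered choice the result depends on the order of the alternatives, so reordering them is one fix; longest match gives the same answer in any order, at the cost of always trying every alternative.
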
@PhilippeSigaud (Collaborator)

Good, I'll merge that.

I played with parallelism when it arrived in Phobos a few years ago, thinking that many parts of parsing could be parallelized, particularly for ambiguous parsing when we have to try all branches in alternatives anyway (that was for a GLL parser).
IIRC, the time spent managing the thread pool was bigger than the time gained by using multiple processors. The fastest parser was the most basic one: just spawn a new thread, without question, at each or/alt node. No pool, just bare spawning and receiving answers back from the child threads.
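
The "bare spawning" structure described above can be sketched roughly as follows in Python. This is only an illustration of the shape: one thread per alternative, no pool, results collected after joining. It is not the GLL parser mentioned, and `literal`, `spawn_alternatives` and `parallel_longest_match` are hypothetical names. In standard CPython builds the global interpreter lock prevents any real speed-up for pure-Python work, so this shows the control flow only.

```python
import threading
from typing import Callable, List, Optional, Sequence

# A parser takes (text, pos) and returns the end position on success, or None on failure.
Parser = Callable[[str, int], Optional[int]]


def literal(s: str) -> Parser:
    def parse(text: str, pos: int) -> Optional[int]:
        return pos + len(s) if text.startswith(s, pos) else None
    return parse


def spawn_alternatives(alts: Sequence[Parser], text: str, pos: int) -> List[Optional[int]]:
    """Start one thread per alternative and wait for all of them to report back."""
    results: List[Optional[int]] = [None] * len(alts)

    def worker(i: int, alt: Parser) -> None:
        results[i] = alt(text, pos)  # each thread writes only its own slot

    threads = [threading.Thread(target=worker, args=(i, alt))
               for i, alt in enumerate(alts)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def parallel_longest_match(*alts: Parser) -> Parser:
    """Longest match where the alternatives are tried in freshly spawned threads."""
    def parse(text: str, pos: int) -> Optional[int]:
        ends = [e for e in spawn_alternatives(alts, text, pos) if e is not None]
        return max(ends, default=None)
    return parse


keyword = parallel_longest_match(literal("in"), literal("integer"), literal("begin"))
print(keyword("integer", 0))  # 7
```

Because every alternative writes to its own slot in `results`, no lock is needed for collecting the answers; sharing anything else between the threads, such as a memoisation table, would need explicit synchronisation.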

@PhilippeSigaud merged commit ad58513 into dlang-community:master Feb 28, 2017
@veelo deleted the longest_match branch September 7, 2023 14:48