Releasing MEDFORD 2.0 #18
Creating a new linereader object that creates Line objects. Line objects hold lines from the linereader and automatically parse the input line to remove inline comments, find used macros, etc.
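A minimal sketch of what such a Line object could look like. The `Line` fields, the regexes, and the comment and macro syntax used here (`#` after whitespace for inline comments, `` `@name `` for macro use) are assumptions made for illustration, not the parser's actual definitions.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Assumed syntax for this sketch only: an inline comment starts at whitespace
# followed by '#', and a macro use looks like `@name.
INLINE_COMMENT = re.compile(r"\s+#.*$")
MACRO_USE = re.compile(r"`@(\w+)")


@dataclass
class Line:
    """Hypothetical parsed line: keeps the raw text plus derived fields."""

    line_no: int
    raw: str
    content: str = field(init=False)
    comment: Optional[str] = field(init=False)
    used_macro_names: List[str] = field(init=False)

    def __post_init__(self) -> None:
        match = INLINE_COMMENT.search(self.raw)
        if match:
            # Everything before the comment is the line's content.
            self.content = self.raw[: match.start()]
            self.comment = match.group(0).strip()
        else:
            self.content = self.raw.rstrip()
            self.comment = None
        # Record macro names used in the content (not in the comment).
        self.used_macro_names = MACRO_USE.findall(self.content)

    @property
    def has_macro(self) -> bool:
        return bool(self.used_macro_names)


line = Line(3, "@Paper-Title `@study results # draft title")
```

Parsing at construction time means later stages can read `content` and `used_macro_names` without re-scanning the raw text.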
- adding details (NovelDetailLine, ContinueLine collections)
- adding blocks (detail collections)
Getting closer to a full reimplementation. Added the LineProcessor function to start collecting Line objects into logical collections. (e.g. NovelDetailLine + Continue Lines + NovelDetailLine form a Block) Also started a tmp_medford to hold a sketch of the logic of a standard run. Also created processed_lines to start collecting Line collections. Macro has already been moved; Block will follow.
Went ahead and moved Block and Detail to be beside Macro. Made Block and Detail subclasses of LineCollection. Detail now uses get_raw_content and get_content instead of payloads. processed_lines has been renamed to linecollections to better represent the purpose of these data types. LineProcessor has been renamed to LineCollector to better represent its purpose. Fixed an accidental bug in LineReader that caused it to treat everything as a Macro.
lines: renamed payload to raw_content for consistency...
linecollector: additional logic to actually create the first block.
* should go back and double-check this logic when writing linecollector tests.
linecollections: minor bug fixes, such as:
* correct var used for indexing
* temporarily remove Detail.validate() until logic is written
* set self.is_header as False when not a header Detail
* fixed ^ use order of operations
* Block now supports name-only blocks (e.g. placeholder blocks)
major_token was defined as a single string. This has been adjusted back to being major_tokens and List[str] to allow for the possibility of compounding tokens (e.g. File-Primary, File-Remote). Started adding Dictionizer tools that, given a list of Blocks, convert them into the Dictionary format expected by Pydantic. Added some simple beginning Dictionizer tests.
Began rewriting Pydantic models to take advantage of new custom classes (e.g. Detail, Block) to store information like line #, etc.
Bugfixes include:
- changing references to Macro dict to be Dict[str, Macro] instead of Dict[str, str].
- add a headDetail attribute to Block to access detail that defines its name.
- fix order of name, detail in Dictionizer.
- return Detail when setting name attribute in Dictionizer.
Dictionizer's Dict[str, List[Dict[...]]] typing wasn't playing nice with Pydantic because many models now have the 'Block' attribute, which is not a List of a Dict. Typing needs to be fixed later to properly represent the fact that Dict values can be either List[Dict[...]] or Block.
Separated "get content" from "resolve" for Macros. To get the content of a Macro after its macros have been replaced, use the "resolve" function. This is because Macros keep track of whether or not they've been resolved before, to save processing time. For example, if two macros reference Macro1, the second time Macro1 has resolve() called, it should immediately return its .resolution attribute. Added a ludicrous amount of tests to try and make sure macro logic works. Removed test_obj_lineprocessor, because it's a holdover from before I renamed it to linecollector.
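The caching behaviour described above can be sketched roughly as follows. Only `resolve()` and `.resolution` come from the commit message; the constructor, the `macros` argument, the `resolve_calls` counter, and the `` `@name `` reference syntax are assumptions made for this sketch.

```python
import re
from typing import Dict, Optional

# Assumed reference syntax for this sketch only: `@name
MACRO_REF = re.compile(r"`@(\w+)")


class Macro:
    def __init__(self, name: str, raw_content: str) -> None:
        self.name = name
        self.raw_content = raw_content
        self.resolution: Optional[str] = None  # filled in by the first resolve()
        self.resolve_calls = 0  # hypothetical counter of non-cached resolutions

    def get_content(self) -> str:
        # Raw content, with any macro references left untouched.
        return self.raw_content

    def resolve(self, macros: Dict[str, "Macro"]) -> str:
        # Return the cached result if this macro was resolved before.
        if self.resolution is not None:
            return self.resolution
        self.resolve_calls += 1
        # Replace each reference with the (recursively) resolved target.
        self.resolution = MACRO_REF.sub(
            lambda m: macros[m.group(1)].resolve(macros), self.raw_content
        )
        return self.resolution


macros = {
    "Macro1": Macro("Macro1", "Example Lab"),
    "A": Macro("A", "`@Macro1 dataset"),
    "B": Macro("B", "`@Macro1 protocol"),
}
resolved_a = macros["A"].resolve(macros)
resolved_b = macros["B"].resolve(macros)
```

Both `A` and `B` reference `Macro1`, but `Macro1` does the substitution work only once; the second request returns `.resolution` directly. This sketch does not detect circular references or unknown macro names.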
Added helper function for Blocks to provide a str version of their major token chain. Started adding command line arguments for actually running MEDFORD in the temporary new MEDFORD main file. Adjusted LineCollector to separate named blocks by major token. Now, two blocks with different major tokens may share a name. Added functions to LineCollector to provide all blocks and to provide macros, rather than the main function having to scoop them out itself.
Forgot to update LineCollector tests to use new internal representation of LineCollector. Fixed bug where LineCollector never instantiated sub-Dict of self.named_blocks() when trying to add blocks.
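A rough sketch of keying named blocks first by major token and then by name, with the sub-dict created on demand (the step the bug fix above concerns). `NamedBlockIndex`, its methods, and the use of plain strings in place of Block objects are hypothetical stand-ins, not LineCollector's real interface.

```python
from typing import Dict, List


class NamedBlockIndex:
    """Hypothetical two-level index: major token chain -> block name -> block."""

    def __init__(self) -> None:
        self._named_blocks: Dict[str, Dict[str, object]] = {}

    def add(self, major_tokens: List[str], name: str, block: object) -> None:
        key = "-".join(major_tokens)
        # setdefault creates the inner dict the first time a major token is
        # seen; indexing self._named_blocks[key] directly would raise KeyError.
        self._named_blocks.setdefault(key, {})[name] = block

    def get(self, major_tokens: List[str], name: str) -> object:
        return self._named_blocks["-".join(major_tokens)][name]


index = NamedBlockIndex()
# Two blocks with different major tokens may share a name.
index.add(["Paper"], "Primary", "paper block")
index.add(["File", "Primary"], "Primary", "file block")
```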
Figured out how to make an Enum accept different cases. Now can use any capitalization of a known Mode for setting the -m when running MEDFORD.
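One common way to get this behaviour is the `Enum._missing_` hook, which Python calls when a value lookup fails. The sketch below assumes that approach; the member names and values of `Mode` are made up for illustration, and only the `-m` flag comes from the commit message.

```python
import argparse
from enum import Enum


class Mode(Enum):
    # Hypothetical mode names; the real set of modes is not shown here.
    VALIDATE = "validate"
    COMPILE = "compile"

    @classmethod
    def _missing_(cls, value):
        # Called when Mode(value) finds no exact match: retry in lowercase.
        if isinstance(value, str):
            lowered = value.lower()
            for member in cls:
                if member.value == lowered:
                    return member
        return None  # falls through to the usual ValueError


parser = argparse.ArgumentParser()
parser.add_argument("-m", "--mode", type=Mode, default=Mode.VALIDATE)
args = parser.parse_args(["-m", "CoMpIlE"])
```

Because `type=Mode` calls `Mode("CoMpIlE")`, the case handling lives in the Enum itself rather than in the argument-parsing code.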
This removes the leading and trailing \n and spaces from output that is passed to Pydantic.
ErrorManager is now called from the Medford script itself, if an error is encountered while pydantic is parsing the dictionary representation. ErrorManager now has some concept of Pydantic errors. Adjusted Block to no longer be a subclass of LineCollection since it shouldn't have the same properties (e.g. HeadLine, it should instead ask its head detail for its head line, if necessary).
Old tests have been moved to DEPRECIATED_tests folder because they no longer compile due to Pydantic updating to version 2.
Began implementing @-@ capabilities. Involves:
- new Line type
- new regex recognition in LineReader
- New collection type in LineCollections ('AtAt')
- New validate_atat function for all LineCollections. Returns True except for the AtAt type, which actually validates.
- Dictionizer now also takes a dictionary of strs (names) to Blocks on initialization
- ... Which is now output from LineCollector, using the 'get_1lvl_blocks' function.
- Dictionizer got a new validate_atat function that is called from generate_dict.
Also, additional fixes:
- Fix for Details not storing has_macro and used_macro_names information.
- renamed used_macros to used_macro_names for consistency in Block objects.
TODO: add tests that actually test the @-@ validation functionality.
LineCollection __eq__ now properly compares macro usages. AtAt detection regex now uses the right string termination flag. Getting the content of a line now defaults to removing the inline comment.
PR to move all the development that has been done on medford parser 2.0 into the main branch, so that we can work on reorganizing the repository for future development.
The core logic of the medford parser has been completely redone to (hopefully) be more object-oriented and amenable to adding features. A document fully describing these changes will be coming in the near future.
Importantly, this will break all external tool support, which I believe is currently only the medford vscode extension and its associated LSP. This will hopefully be fixed in the medford parser v2.1.