-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PVL Read ISIS Control network #98
Comments
This is a fun one. So this happens:
The fact that the PVL and ISIS dialects will load it, but the Omni dialect doesn't is ... disturbing. Turns out I'm pretty sure it is the PVL-text's fault and not the pvl module's fault, but in the short term, you can use the ISIS or PVL grammars like this to get going (but this is really just an ugly hack):
Now, to the solution: I'm not sure why your exception isn't showing the line number like the ones I'm getting from my Why is it trying to do that? Character sets! What do the PVL and ISIS Why does this matter for your PVL-text? Well, it turns out that the character right after the "D" in "END" which would appear to finish the PVL-text in the header of the file is a NULL character (ASCII 0, Hex 0x0) which is part of ASCII, but one of the characters that the PVL spec expressly forbids (but the ODL/PDS3 can handle it, and its is certainly a valid UTF character for the OmniGrammar). So the PVL/ISIS grammars come across this character and say "whoa, I can't deal with this character, so I'm going to consider this point the end-of-file, does what I have so far make sense as PVL?" and since the last recognizable character it grabbed was the "D" in "END," the preceding is all good, it happily returns a dict-like. When the ODL/PDS3/Omni grammars come along, the last recognizable token that they could swallow was "End_Object = ProtoBuffer" and the next one they tried to parse starts with "END" but is followed by a bunch of completely parseable characters but then there is a double-quotation character which can't be in a PVL Aggregation Block, Assignment Statement, or End Statement and so throws the error. This changed with pvl 1.3.0, prior to that the OmniGrammar had the same restrictive character set as original PVL itself, and would have succeeded on this PVL-text. What's the right long-term solution? I think that the change to the OmniGrammar approach was correct, and should stay. I think that the pvl module is correctly handling things. I think that whatever is writing these control networks needs to stop writing broken PVL-text. Here's the problem, just the characters "END" aren't enough. What if there was PVL-text like this:
If the only thing the parser did was look for that three-character sequence, the returned dict-like wouldn't have the "ENDED" key-value pair. The next thing after the "END" is important. According to the original PVL-spec It can be one of these four things:
The problem is that whatever is writing that file just starts adding NULL characters after the "D" in "END" and since that character happens to be on the PVLGrammar's no-parse list, condition 4 gets triggered, but kind of by accident. The better thing to do is put a semi-colon after the END or inject some proper whitespace. |
@rbeyer This is great information both to get this working and to get an issue opened on the ISIS repo in order to consider a change over there to how the PVL header to an ISIS control network is being written. |
Closing this Issue, but glad to see other Issues linked to it. |
I've been thinking more about this. There is a solution that would allow this PVL-text to be loaded by the Omni dialect (the default strategy used to parse PVL-text when the user doesn't specify). That solution would be to disallow recognition of ASCII NULL characters ( That is fundamentally what happens in the PVL and ISIS dialects, and why they will load this PVL-text. The pvl 1.3 change was to remove all character forbiddence and take any UTF character and parse until the PVL-text ended according to the various PVL rules. As I stated above, my thinking was that this approach would be as broad as possible, but maybe that is indeed the wrong perspective? Changing this approach for the default Omni strategy means that ASCII NULL characters could not be a part of PVL-text parameters, values, or comments, because as soon as an ASCII NULL were encountered, PVL-text parsing would terminate. Is this a big loss? Probably not, especially since that is how this library largely functioned for a long time anyway, and who in their right mind would include an ASCII NULL character intentionally as part of PVL-text that is meant to be recovered? My suspicion is that ASCII NULLs are only injected mistakenly, perhaps with the idea that they were "terminating" PVL-text parsing. Making this change would still allow the wide rainbow of UTF characters to be parsed by the default strategy and only exclude the ASCII NULL, which 99% of the time (like in this example) is meant to (incorrectly) signal the end of PVL-text parsing anyway. I imagine this would make Jay happy (but the ISIS code should shape up its PVL-text writing anyway), but can anyone think of a reason why we shouldn't exclude ASCII NULLs? |
LGTM |
I agree 100% that ISIS should clean up it's PVL and be compliant to a written, and widely adopted and supported, standard! |
Describe the bug
Unable to read an ISIS control network after upgrading to > 1.0. In the past, I have been able to pass a full ISIS control network to pvl and have it read the header off the network. The protobuf internals are outside the scope of the pvl library.
To Reproduce
Using the attached control network run a pvl.load(). The attached network is zipped so that GitHub will accept it as a filetype.
test.net.zip
Expected behavior
A clear and concise description of what you expected to happen.
Error logs or terminal captures
Your Environment (please complete the following information):
pvl
Version 1.3.0Additional context
I am not 100% sure that this is a PVL issue or an issue with the mocked control network. The Aggregation Block and lexor errors are opaque to me though, so I am looking for some additional guidance.
The text was updated successfully, but these errors were encountered: