-
Notifications
You must be signed in to change notification settings - Fork 200
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Python runtime version check is slow #804
Comments
A common way to speed-up is to use |
In the event that KOLANICH's observation bears no fruit....
This seems like a very good approach to me -- all that's missing is a custom comparison function which compares the numbers as numbers and does something intelligent if it encounters string components or suffixes. This saves you from having add an attribute and having to change the runtime to support both the "current" approach and the "legacy" approach. |
... does it do that at import time? To clarify, the ksc-generated version check uses $ python3 -m timeit --setup='import pkg_resources' -- 'pkg_resources.parse_version("0.9")'
20000 loops, best of 5: 15.2 usec per loop The performance problem here is importing $ for i in 1 2 3 4 5; do python3 -m timeit --number=1 --repeat=1 -- 'import pkg_resources'; done
1 loop, best of 1: 179 msec per loop
1 loop, best of 1: 209 msec per loop
1 loop, best of 1: 227 msec per loop
1 loop, best of 1: 185 msec per loop
1 loop, best of 1: 199 msec per loop (imports get cached by the Python interpreter, so to see the actual import time I have to start a new Python interpreter each time and can't use |
The problem is that the version check/comparison code is placed directly in the source code of each generated Python file, so it should be kept short/compact - there certainly shouldn't be any long/complex function definitions. Also, if there's a bug in the generated version comparison code, the only way to fix it is to update the compiler and ask users to regenerate all their parsers (or patch the generated code by hand). This is of course why normally you don't implement this kind of logic by hand and instead use a library. The problem is that we can't put the relevant code into the |
@dgelessus, you're quite right. (Although, I'm not sure that expecting people to update the compiler and runtime together is unreasonable.) And, I concur that having the comparison function, if it is non-trivial, placed in the runtime would be better. But, this basic approach has enough advantages to make it worth considering. I imagine that, if we wanted to, we could come up with some clever ways for the generated code to interrogate the features of the runtime. For instance, I suspect that it could check if the runtime has a "check version function", and, if it does not, it could fall back to the present mechanism. However, if you're willing to permit a dependency on Python regular expressions, this might be sufficient:
I think that covers all but the most obscure of cases. (I.e., it doesn't cover "post release" numbers, and it may get "release candidate" vs. "development" releases wrong, but otherwise I think it covers the first nine "alpha", "beta", and "release candidate" versions correctly, in addition to all the normal cases.) |
It seems that yes, though from the code it is not obvious.
|
IMHO an overkill to use |
Yes! Much better:
|
The problem is that the version check can't assume that the runtime is up-to-date - its entire purpose is to check for outdated runtime versions and give a user-friendly error message in that case.
This is more or less what I was suggesting with my fourth solution :) If On that topic - if we do use that solution, we should perhaps use a different name than
I don't think that's a good idea... IMHO either we should support pre/post-releases correctly, or we shouldn't support them at all and hard-error out if they are encountered in the version string. I don't think it's very useful to only handle some pre/post-release combinations correctly and return wrong results for other combinations. Also, the suggested parsing code (both the Alternatively we could decide that the Python runtime version should only ever use numbers and dots, but I think it's a bad idea to assume that - especially if at some point we decide to use a tool like (This is another reason why I suggested adding a new attribute for the compatibility check. That way we only need to keep that attribute in a format that is compatible with the version check, and the The version check can't depend on any external libraries (see my correction in the top comment), and that module is definitely too long to be included directly in the generated code. |
I know that it is inacceptible to use a dependency (it can be vendored, but it is still too large for just a version check) for just a version check. In fact I have developed it for a package management solution. It was just mentioned to make sure at least someone heard about it :)
hmm, good catch. How about just cutting the part before the first non-digit non-dots, and then splitting it by dots? |
As an off-though — any chance switching to something like what Perl does will help the performance in this case? I.e. instead of having weird version formats and non-trivial comparison algorithms, we can just formalize everything to |
I have a similar opinion - as @dgelessus pointed out, one can also compare tuples in Python, which seems to be a better alternative to an integer. AFAIK all languages other than Python store the runtime version in a single integer, so why would we need to introduce an algorithm for parsing and comparing versions stored in a string? What's the benefit of doing that? I vote for this option:
However, I'm quite stunned by what @dgelessus said about it:
How is it possible that a simple comparison of two tuples takes 40 ms? Can you please expand it and show the benchmark, @dgelessus? If the problem is in tuples, does the substitution with plain integers help? My common sense says that parsing version from a string must necessarily take much longer than comparing any two primitives in Python. |
It doesn't, don't worry 🙂 I should have worded that more clearly - 40 ms isn't the execution time of the version check, but the execution time of the entire command ( PS: And yes, if Python didn't have built-in support for comparing tuples, I would have probably suggested a format based on a single number instead. But Python has proper tuple comparison that is also very fast, and with it you don't need to worry about adding leading/trailing zeroes in the right places. |
@dgelessus Oh, I'm sorry, I missed the little no there 😆 |
Someone should probably define the problem scope with a little more specificity -- for instance, are we only worrying about Python, here? My previous suggestion was made under the guiding principles of simplicity with an eye toward an 80% solution. That is, there is nothing which says that Kaitai Struct has to make full use of the Python Version Identification specification -- especially if/when it's not used for other languages which KS supports. If you limit the version spec to On the other hand, if you're going to define a new attribute, why define it to be a data value? That imposes all sorts of constraints. Why not define it to be a "callable". With that, the generated code can provide whatever value the compiler sees fit, and the runtime has full freedom to decide whether it's compatible (i.e., the argument could be an integer, a string, a tuple, and, better yet, it could change over time without breaking compatibility). |
Yes - the performance problem here comes from the use of
So other languages probably don't have performance problems with their version checks. (If they do, that would be worth opening a separate issue.)
I don't think it's a good idea to assume that, especially in generated code that is very hard to update in the future. See my previous comment in #804 (comment):
Also, "simplicity" is a bit subjective here. I would argue that a solution that requires parsing strings is less simple than one that just compares tuples. I'm also curious what the motivation is for not adding a new attribute to the runtime. It's a very small change and doesn't affect any existing users of the runtime. The compatibility checks based on the new attribute are also very easy to make backwards-compatible - if the new version attribute is missing, we know that the runtime is incompatible.
I thought about that a bit too, and it would allow some more flexibility - for example a new runtmie could retroactively stop supporting old compiler versions. It would also be more complex though, both in the runtime and in the generated code (because you still need to raise a user-friendly error for old runtimes that don't have the function). And even if we do provide a function like that, I think the runtime should still provide the version information as a machine-readable attribute, so that the compiler can make its own decisions based on the actual version number (for example if the compiler ever gets the ability to target older runtimes). |
The previous implementation used pkg_resources.parse_version to parse the kaitaistruct module's __version__ string. Unfortunately, importing pkg_resources is relatively slow, which had a noticeable performance impact on every ksc-generated parser. The new implementation uses the newly added kaitaistruct.API_VERSION attribute, which stores the runtime version as a tuple of ints. In this format the version information can be compared directly without having to parse it from a string first. This allows removing the import of pkg_resources, which fixes the performance problem. Part of the fix for kaitai-io/kaitai_struct#804.
@dgelessus Can you please add the patch version number to the The checks in all other languages have the patch version (usually there is a integer with format Even if we didn't want to stick to semver, it's always good to have some magic variable to tweak in case something goes wrong 😛 Hey, you have no idea how much I look forward to a day when we'll be using semantic versioning and releasing several patch/minor versions per week. That would make the life a lot easier 😎 |
But we currently know what all of the existing generated code looks like now, right? And, presumably, you can control what it will look like in the future. So, it's not unreasonable to make a policy statement like, "Henceforth, versions shall be in
True. However, I would suggest that adding an
In terms of the runtime, I don't think the added complexity is a real problem, particularly when balanced against the power and flexibility that it affords (and, given that it can be relatively easily fixed if it doesn't work as hoped). As for the generated code, the difference is very small: you're already proposing code which handles the situation if the attribute is not present; to use the attribute as a callable simply means using the result of the call if the attribute is present instead of doing the comparison directly (it's a couple of dozen extra characters...).
That's fine, but it raises an interesting issue: is there a requirement that the runtime be available when the compiler is run? What about the situation of building on one platform for deployment on another? It strikes me that the compatibility check can and should only be made at runtime, not at compile-time. Just out of curiosity, if all the other languages use a single integer for the version, why does Python use a different approach? (It seems like the issue of the runtime compatibility check should be language-independent, in which case having a language-specific solution seems weird.) |
@generalmimon I could do that, but I'm not sure how it would be useful... the current version number is 0.9, not 0.9.0, so the correct tuple representation is
As I explained before, there are tools in the Python ecosystem that automatically generate version numbers, and they can and will use version strings that don't fit the simple This is why I suggested adding a new separate attribute (
As I understand it, the runtimes are intended only for use by ksc-generated code, so their interface doesn't really need to be documented or supported for external users.
OK, I didn't word that very precisely - I meant that the compiler-generated code should be able to make its own decisions based on the version number, not the compiler itself. There is of course no requirement that any language-specific runtime (or the language itself even) is available when running the compiler.
They don't - C++ uses an integer constant that is checked manually with an I think I already said this in another issue, but just because KS targets many languages doesn't mean that every feature needs to be implemented 100 % identically in every language - instead it should be implemented in a way that best fits each target language while giving consistent behavior across languages. So I think that as long as the compatibility checks behave the same way in all languages (e. g. 0.8-generated code works with runtimes 0.9 and 0.10 but not 0.7), it shouldn't matter if the version is stored as an integer, double, tuple, string or whatever. (By the way, comparing tuples to check version compatibility is not unusual in Python - it's relatively common to write something like |
I think you're both right: "0.9" implies "0.9.0", and it behaves as such when compared as tuples to other versions (obviously, though,
I apologize: I wasn't aware that you didn't not necessarily have control over the format of the version spec.
Fair enough, but there still has to be some description of how the compiler and runtime work together so that they can be maintained and enhanced, and having the magic cookie passed between two separate components makes it harder to maintain than it would be if you had all the magic encapsulated inside one of them.
OK, but that kind of flies in the face of your earlier arguments about needing that code to be so simple because it cannot be changed without requiring the end-user to rerun the compiler.
I certainly don't disagree. I was keying off of an earlier comment by @generalmimon. If a particular practice is common, it's usually best to maintain it across scenarios; if, however, there is sufficient justification, then a custom solution should be used. From what you say, though, there isn't much commonality, so there is no reason not to take advantage of the language capabilities.
No, I like that approach (as I suggested initially). What I don't like is enshrining it in the generated code if its practical to encapsulate it in the runtime. |
To summarize a bit (since there hasn't been any more discussion here recently) - as I understand it, everyone is okay with the general idea of replacing the version string parsing with preparsed versions in tuple format. The remaining open questions are:
(if I forgot anything, please comment!) I don't have a strong opinion on the first point, because either way would work fine, and you can freely switch between the two variants across releases. I slightly prefer the About the second point: the main advantage of abstracting the check behind a function is that it gives more flexibility to future versions of the runtime, by allowing them to reject specific old compiler versions and/or generate better error messages than the default "Incompatible Kaitai Struct Python API". A disadvantage is that if the runtime provides only a function for checking the version, the compiler-generated code can't perform any more advanced version checks in the future (e. g. disallowing runtimes with a higher number in the first version part). This can be easily fixed by providing both I think a more important question is whether any of this flexibility will ever be actually used. I think it's relatively unlikely, simply because no other runtime has this kind of advanced logic for version checking, and instead they all do a simple check that blocks all runtimes older than the compiler version. So even if (for example) at Kaitai Struct version 3.0 we decided to drop compatibility with code generated by compiler versions older than 1.0, none of the runtimes (except for Python) would be able to check for that, and we'd have to explain this breaking change to users in some other way (probably by announcing it in the release notes and explaining it in the docs and FAQ - which is good practice anyway). A different question - since the 0.9 release is relatively close, would it be realistic for this change to be included in 0.9 still? It will be a relatively small change (no matter how exactly we implement it in the end) that leads to a significant performance improvement (for small scripts at least), so it would be very nice to have it in the next release. Then I could finally stop using a patched unreleased ksc version for everything 🙂 |
IMHO for a small function called once and not poinsoning namespace it completely doesn't matter from usability PoW. From performance PoW inlined should be a bit faster. IMHO - we should inline if it is just tuple comparison, otherwise should have a function for version com-arison, but the check is just an inlined if. This approach combines the worst parts of both: worse performance because of a function call in a wrong place and the comparison call is not a function itself. A better approach would be to have 2 functions - one for comparison and one for raising. Having only one check function with comparison inlined into it is also possible, but is just ugly.
Why not? KS is not yet stabilized. |
If you proceed with encapsulating the check in a runtime function, then I think the question about how to format the version tuples becomes effectively moot -- it's an implementation detail in the runtime, which can easily be changed without breaking anything. Assuming that the runtime function would do the same thing as the generated code, I would think that any performance difference between them would be "lost in the noise", given that it's a one-time check. (E.g., I would expect that the cost of a procedure call would on the order of the cost of comparing a multi-part tuple -- so you're taking something that we think is cheap and making it half as cheap but still not expensive (e.g., no file system operations!)...in exchange for significant flexibility.) I have to admit that I don't follow @KOLANICH's concerns about the performance: this is a one-time execution in the context of running an entire application -- I expect that it will be impossible to measure whether it is in-line or reached via a procedure call unless you strip out everything but the check....
Yes, but updates to the runtime could provide that support, and then it would be available to all generated code. (And, the initial release of the runtime function should probably return a failure if the first number is different, since that's generally what that difference means....)
I suggest that that complexity isn't warranted -- it would only be helpful in a situation where a new compiler is trying to work around a problem with an old runtime...wouldn't it be better to just have the user update their runtime? If you really feel compelled to offer (I'm going to abstain on the question of whether this should go into 0.9...but I will offer that the sooner this is settled, the better.... ;-) ) |
…ts (#49) * Add API_VERSION constant storing the runtime version as a tuple of ints Part of the fix for kaitai-io/kaitai_struct#804. * Replace shortened GitHub issue ref with full URL (for convenience) Co-authored-by: Petr Pucil <[email protected]>
The previous implementation used pkg_resources.parse_version to parse the kaitaistruct module's __version__ string. Unfortunately, importing pkg_resources is relatively slow, which had a noticeable performance impact on every ksc-generated parser. The new implementation uses the newly added kaitaistruct.API_VERSION attribute, which stores the runtime version as a tuple of ints. In this format the version information can be compared directly without having to parse it from a string first. This allows removing the import of pkg_resources, which fixes the performance problem. Part of the fix for kaitai-io/kaitai_struct#804.
This can now be closed. |
@dgelessus Uhhh, not sure why I haven't realized this before releasing 0.10 (seems pretty obvious now), but the version check introduced in KSC 0.10 effectively doesn't do anything (and I see it's ultimately my fault, I'm sorry for that): if getattr(kaitaistruct, 'API_VERSION', (0, 9)) < (0, 9):
raise Exception("Incompatible Kaitai Struct Python API: 0.9 or later is required, but you have %s" % (kaitaistruct.__version__)) This check (in this form) will not fail for any past version of the Python runtime library, so it's 100% useless. 0.9 and older runtimes don't have the This is a result of my commit kaitai-io/kaitai_struct_compiler@c4cb01b, where I bumped the versions since the new version check didn't get into 0.9. If I didn't try to be creative with "let's allow even the 0.9 runtime" and bumped the val minimalRuntime: KSVersion = KSVersion.fromStr("0.9") the version check would have worked now. On the other hand, it's still probably better to have this new "check" which doesn't work (but at least doesn't bother anyone) in 0.10 than to be still stuck with the old version check from 0.9 that worked but imported And this new check will be fixed once we bump |
0.11 will definitely include some changes that require an updated runtime - for example kaitai-io/kaitai_struct_cpp_stl_runtime#67 or the fact that (once the `serialization` branch is merged) the compiler will no longer insert `align_to_byte()` calls and will instead rely on the runtime libraries to do it: kaitai-io/kaitai_struct#1070 (comment)). Besides, we need to require something newer than 0.9 to fix the version check in Python: kaitai-io/kaitai_struct#804 (comment)
The code generated by ksc to check the version of the Python runtime is quite slow. To measure this, I compiled the spec
common/bytes_with_io.ksy
, which parses the entire input as a single byte array, so there's practically no overhead from type definitions. The current ksc 0.9 snapshot (ce18dc1, kaitai-io/kaitai_struct_compiler@f725faf) generates the following Python code:With this code, running
python3 -c 'import bytes_with_io'
takes about 260 ms (on average, on my machine). For comparison, if I remove the version check and related import:I get times of around 40 ms. Starting Python and running nothing takes 38 ms, which means that the KS code (runtime and generated code) only takes 2 ms, and
import pkg_resources
and the version check takes 220 ms! (Practically all of this time comes fromimport pkg_resources
, not from the actual version check code.) This is a noticeable delay, especially for small scripts that don't process a lot of data, which often take less than 200 ms to execute (without the version check). So I experimented with some possible solutions for reducing the time needed for the version check:ParsingJust noticed that this isn't actually an option for us. Even if we make the next version of__version__
usingpackaging.version.Version
. The behavior is identical topkg_resources.parse_version
, becauseparse_version
actually usespackaging.version
internally. The disadvantage is that it requires a dependency on thepackaging
library. (pkg_resources
has its own internal copy ofpackaging
, so even ifpkg_resources
is available,import packaging
might not work.) This version check takes about 20 ms (the entire command takes 60 ms).kaitaistruct
depend onpackaging
, older versions won't have that dependency, sopackaging
might not be installed when using an older runtime and the check wouldn't work.__version__
by hand into a tuple of ints and comparing that (something liketuple(int(part) for part in kaitaistruct.__version__.split(".")) < (0, 9)
). This requires no external dependencies, but can only handle simple version strings like0.9.1
- something like0.9.1.dev
, which is allowed in Python version strings, won't work. This version check has no noticeable performance impact (the entire command takes 40 ms).kaitaistruct.__version_info__
attribute that stores the version number as a tuple of ints, which can be compared. This is similar to the previous solution, except that it doesn't require any parsing in the generated code, and doesn't need to handle non-numeric version parts (those parts can be left out of__version_info__
). The disadvantage is that it requires a change to thekaitaistruct
runtime. This is not a big issue though - as long as__version__
stays, older version checks will continue to work, and new version checks need to handle old runtimes that don't have__version_info__
(something likegetattr(kaitaistruct, "__version_info__", (0, 8)) < (0, 9)
). This version check has no noticeable performance impact (the entire command takes 40 ms).I personally prefer the last solution, because it's the simplest, has the least runtime overhead, and doesn't require any external dependencies. But if anyone has arguments for the other solutions, or a better idea that I haven't thought of, let me know.
(based on these Gitter messages by myself: https://gitter.im/kaitai_struct/Lobby?at=5f372b993e6ff00c28939f29)
The text was updated successfully, but these errors were encountered: