-
-
Notifications
You must be signed in to change notification settings - Fork 30.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpo-31415: Add -X importtime
option.
#3490
Conversation
It shows show import time of each module. It's useful for optimizing startup time.
f57fb1b
to
280d585
Compare
This should probably be added to the equivalent code in importlib as some other interpreters use that code as their direct import implementation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like the idea! First review.
Doc/using/cmdline.rst
Outdated
@@ -407,6 +407,8 @@ Miscellaneous options | |||
* ``-X showalloccount`` to output the total count of allocated objects for | |||
each type when the program finishes. This only works when Python was built with | |||
``COUNT_ALLOCS`` defined. | |||
* ``-X importprofile`` to show how long each import takes. Nested imports are | |||
indented. It can be used to optimize application's startup time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hum, maybe rename the option to "-X importime"? I don't know.
Maybe explain that the displayed time is the cumulative time, including import time of sub-imports.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed importprofile to importtime.
And self time is shown now, as Serhiy's suggestion.
Python/import.c
Outdated
fprintf(stderr, "%*s- %s %ld [us]\n", | ||
import_level*2, "", | ||
PyUnicode_AsUTF8(abs_name), | ||
(long)(t2 - t1) / 1000); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to use the ms unit if us >= 10 000? It would make the output easier to read, no?
I don't understand the "123 [us]" format. Why not "123 us"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would vote for "123 us" too, or "[123 us]".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Square brackets is used to make it clear it's not variable (123 * u * s), but physical unit (123 in μs).
In English, it's rarely used, only when it can be confusing.
In Japan, some community use "use square bracket always for physical unit" habit.
And I studied in such college.
I just know it's local habit and abuse in general!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to use the ms unit if us >= 10 000? It would make the output easier to read, no?
I prefer fixed unit. It makes post analysis (e.g. generate Flame Graph) easy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed with @methane. Mixed units are confusing when values are meant to be compared.
Python/import.c
Outdated
|
||
if (import_profile) { | ||
import_level--; | ||
t2 = _PyTime_GetMonotonicClock(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_PyTime_t has no unit. You cannot use directly t2-t1.
I suggest you to use _PyTime_AsMicroseconds(t2 - t1, _PyTime_ROUND_CEILING). Moreover, it will round "correctly" ;-) (towards +inf, rather than towards zero).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
Python/import.c
Outdated
if (import_profile) { | ||
import_level--; | ||
t2 = _PyTime_GetMonotonicClock(); | ||
fprintf(stderr, "%*s- %s %ld [us]\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In your sample, the output looks nice. But imports can emit warnings, importlib can log extrat messages (try python3 -v), etc. It would help parsing to prefix messages. For example: "import_time: (...)."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated format, and output example in pull request comment.
I don't know about importlib much. Do you mean Anyway, I prefer C implementation because I don't want to worry about performance overhead of this feature. |
Crazy idea: would it be possible to display also a timestamp related to Python startup time? Get monotonic time at startup, then compute the delta.
|
-X importprofile
option.-X importtime
option.
It is possible, but is it worth enough to increase output size? |
At least it would be interesting to know the impact of interpreter initialization excluding stdlib imports. |
bae19a6
to
c915ef0
Compare
Yeah, I expect that Python startup is made of many things, not only imports. I expect that timestamps would help to understand what happens. |
Travis failed because suspicious check. "Inline literal start-string without end-string."
|
OK. Then, adding I think Lines 791 to 793 in b1d1c42
How do you think adding private API like this?
|
After considering it about several minutes, I think it should be added as separated issue.
Or, perhaps, can we use DTrace or other existing tools for it? |
I modified your PR to play with it. I added start/complete logs, and IMHO the indented output is less surprising. Rather that hacking your PR, I created a new one, just to show my proposed format: PR #3765. Example:
Don't look at the exact implementation. I mostly created a PR to discuss the logs format. |
But who cares? Even if few developers (including me and you) cares about it, some logs other than imports should be |
I agree with @methane on the timestamps question: while I do think a "time since the timing infrastructure was initialised" could be a useful thing to consistently add to internal log messages, I don't think it makes sense to consider it as part of this PR. |
Oops, hit the wrong button - sorry. |
Some other formats. Replace bullet to
It seems too dense. "import time" and "imported" looks duplicated.
No
|
Le 01/10/2017 à 04:31, INADA Naoki a écrit :
No |in| or |took|:
|import time: collections.abc (self: 204 us, cumulative: 204 us)
I prefer this one, as it's easier to parse visually.
But if you're willing to spend a bit more time on this, you could
perhaps adopt a table format so that all times are vertically-aligned...
|
Table format:
|
My favourite so far is the one-line nested option:
My rationale for that preference is that it means a simple grep filter can select the top-level imports:
Reporting depth selection is then just a matter of varying the amount of whitespace that the regex allows. I also think that no matter what we do here, converting any single-line-per-module format into a proper display for humans is going to involve post-processing to allow the indirect imports to be reported after the top level import that initiated them. |
I like the table output with the header. |
I'll merge this PR in next day (12+ hours later), if other format gain more votes. |
It shows show import time of each module.
It's useful for optimizing startup time.
Sample (full version):
https://bugs.python.org/issue31415