-
-
Notifications
You must be signed in to change notification settings - Fork 756
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
convert from print() to logging #233
Conversation
this will help when we want to refactor output functions this function should definitely be replaced by a __repr__() or __str__() however
the logging level varies: most is logging.info(), in some place logging.warning() or logging.error() are used when the condition is clearly an error or warning. in other cases, we keep using print, but force writing to sys.stderr, unless we interact with the user. there were 77 calls to print before this commit, now there are 7, most of which in the archiver module, which interacts directly with the user. in one case there, we still use print() only because logging is not setup properly yet during argument parsing. it could be argued that commands like info or list should use print directly, but we have converted them anyways, without ill effects on the unit tests unit tests still use print() in some places this switches all informational output to stderr, which should help with, if not fix jborg/attic#312 directly
@@ -650,7 +652,7 @@ def init_chunks(self): | |||
def report_progress(self, msg, error=False): | |||
if error: | |||
self.error_found = True | |||
print(msg, file=sys.stderr if error else sys.stdout) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should use a logger and use log.error/log.warn
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't that exactly what that patch does?
Current coverage is
|
@@ -143,13 +144,14 @@ def update(self, size, csize, unique): | |||
|
|||
def print_(self, label, cache): | |||
total_size, total_csize, unique_size, unique_csize, total_unique_chunks, total_chunks = cache.chunks.summarize() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems like at that place there should be newstyle string formating and a namedtuple return value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i didn't change that code, i'm trying to keep from doing unrelated changes here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we should do that later.
blindly using the root logger is problematic, instead each module should have own loggers to give control over the granularity |
i am still unclear what to do about also, the test coverage suffers a bit from the patch, not sure how to fix that... |
well, it's a first pass... of course each "module" could use its own logger, but really, do we have modules? |
we use logging.warning in info and list, but print() more usage bits. we also now support logging.debug() and by default are more silent
@anarcat modules like python modules. it is useful if one want rather lots of debug logging, but not from every module, but just some specific one which is being currently debugged. |
true. i'm not sure how to architecture that, suggestions? (by "that" i mean per module loggers) |
About abuse of log levels: steering verbosity ONLY via log level (in the sense of ERROR, WARNING, INFO, DEBUG) seems wrong, especially if it needs hacking the log level in a unnatural way. logging.warning should be only used if it is actually a warning that you're logging, not to make something visible that might not be visible because you reduced output by setting the minimum level to warning. So I propose the following: Use logging.error ONLY for errors, logging.warning ONLY for warning, logging.debug ONLY for debugging. Informative output should be logged via logging.info. This change should be implemented within this pull request. Additionally, we could later introduce one of the alternatives: a) There could be an additional "verbosity level" that controls how much informative output one gets (which could be controlled via verbose being an integer created either from -v0..9 or -v, -vv, -vvv - whatever is easiest with the argument parser). b) There could be additional "output options" that control which kinds of informative outputs are emitted. |
so one of the things i'm trying to do here is to silence borg by default. that means that logging.info() is not shown by default. the "rule of silence". that means that so i agree with your proposal, but i would extend that things that need to talk to the user by default ( as for levels, i think that having three verbosity (silent, verbose, really verbose) makes sense and is a common occurance in software programs. i would like to stick with that. that would map to WARNING, INFO and DEBUG levels. as for output options, maybe using logging modules could help discriminating what output we see? don't know. |
As borg info is just the user interactively requesting to see some information on the terminal, I agree it should be print to sys.stderr. I had the same kind of decision problems print vs. logging at quite some places. About verbosity levels: defining what appears on what level needs a lot of thinking and agreement (and not every user might agree with the set of things we show to appear on some level). Thus, defining named features, we can still define some predefined feature sets that will be the official sets for silent, terse, verbose. One could also give a custom set via cmdline then to get whatever one wants. |
i think we're in violent agreement. :) |
so i looked again at the logging modules stuff, and it seems fairly trivial to log to different loggers based on the module name. i don't know what you guys had in mind, but it seems this could be sufficient, from:
we would instead do:
it's a slightly longer import, but then we can have different logging levels for different parts easily with an INI-style configuration file, or some Python dict object, that can be generated from anything we want. I would advise against creating a complex hierarchy of commandline options for this: there is a neat config file format that is well documented, let's use that. also: the current format is basically "%(message)s" so it doesn't show the log level or the module name. I would add the module name, but i would leave the log level out: it's really annoying to have stuff like:
principle of silence again... |
@anarcat per module logger is good to have, but not a replacement for logging feature sets. I am using per module logging in moin since long, but it is not flexible enough. And at least for borgweb, there must be the log level in the output. And I think it is generally useful, some users misinterpreted attic/borgs output in the past because it was not clear it was just a warning, not a (fatal) error. |
that way we have one logger per module, and we can pick and choose which module we want verbose, for example
the heuristics i used are the following: 1. if we are prompting the use, use print on stderr (input() may produce some stuff on stdout, but it's outside the scope of this patch). we do not want those prompts to end up on the standard output in case we are piping stuff around 2. if the command is primarily producing output for the user on the console (`list`, `info`, `help`), we simply print on the default file descriptor. 3. everywhere else, we use the logging module with varying levels of verbosity, as appropriate.
we should find other ways of forcing that to be shown, it seems
i've reverted parts of the print() to logging.warning() conversion. for stats, the heuristic i use is that when the command is primarly aimed at producing output interactively ( in prompts, we use print on stderr, just because i assume that we may be piping stuff and i don't want to clutter the output (a problem we don't have with i have also pushed per-module loggers, which will give use crude, but at least some control on the output, per module. it doesn't resolve the case of maybe we are conflating two issues here... the way not sure how to deal with that. |
note that the output changes with the new formatter, maybe that's not desired. before:
after:
yuck. i think i'll just revert this. :) |
too much clutter
maybe printing the module name is not needed. |
for
we could have a standard way of handling that behavior. for example: logging.info(stats)
if args.stats and not magically_figure_out_if_logging_wrote_to_stderr():
print(stats, file=sys.stderr) that is: print to the console if requested, but always log so that logfiles have the relevant info, and avoid duplicates. obviously, the above method is incomplete, as i have no idea how to avoid those duplicates. one way to accomplish this would be through logging filters that allow censoring certain content. if we send only objects to the logger, the filter will see objects and can decide whether we need to censor it or not, depending on then those def map_args_to_cls(args):
cls = []
if args.stats:
cls += [ThatsStatsClassName]
# [...] maybe have some way of hooking in here?
return cls
def filtertemplate(args):
cls = map_args_to_cls(args)
def filter(record):
return type(record.msg) is in cls:
return filter
def setup_logging(args, ...):
# [...]
# python 3.2+ specific, before it expects a class, not a single function
stderr_stream_handler.addFilter(filtertemplate(args))
# [...] what do you think? notice how we filter only stderr, which allows us to log everything anyways. |
this greatly simplifies the display of those objects, as the __format__() parameter allows for arbitrary display of the internal fields of both objects this will allow us to display those summaries without having to pass a label to the string representation. we can also print the objects directly without formatting at all.
except AttributeError: | ||
# somehow we failed to find our module | ||
# return the logger module name by default | ||
return __name__ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that fn is unusual, why do you need it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the idea was to standardise logger creation somewhere. the function is necessary to create per-modules loggers, otherwise:
def create_logger():
return logger.getLogger(__name__)
will always return a logger with the name 'logger' (because that's where the function is defined).
maybe overengineered.
assert io_logger.getvalue() == "borg.testsuite.logger: hello world 2\n" | ||
|
||
def test_parent_module(): | ||
assert find_parent_module() == __name__ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think this module is not necessary. we can expect logging stdlib module has own tests / has been tested.
we only need to test our stuff, not whether stdlib works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this was necessary to test that the setup wasn't running twice and therefore adding mutliple stream handlers. i had seen such cases during development and was trying to catch it.
we also test that the module detection code is actually working - i had cases where find_parent_module()
would work yet logging.info()
would use a global logger instead of the module-specific one (i was naively doing a logger = logging.getLogger('')
in logger.py
)...
our default verbosity shows only warnings, we'd have to tweak tests to be verbose for this to work This reverts commit 27be46a.
this was a remnant of when i was writing the converter/upgrader code, and was destined to be a general progress message in the migration process. i removed a more technical, internal debugging message in exchange
Conflicts: borg/archive.py borg/archiver.py borg/cache.py borg/key.py
|
||
we use it to avoid errors and provide a more standard API. | ||
""" | ||
return logging.getLogger(name or find_parent_module()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this seems a bit too little to rectify an own function + that exotic support code.
in each module, it could be just:
import logging
logger = logging.getLogger(__name__)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well i don't know. that's how i originally did it, of course. then i switched in anarcat@42cc17c
the idea was that, as you mentionned, the logging module can be tricky to setup. we haven't had that case yet, but it could be that in some cases we'd want to make sure the logging is setup before creating loggers. having our own functions could enforce that, for example.
i suggest to either merge this to master or create a branch in borg repo so we can collaboratively work there. thanks for your effort! |
i have pushed to logging-refactor in the borg repo for now. i'd like to shove more testing into this before merging it myself (this will happen soon!). |
ok, fine. :) |
after further testing, i am satisfied with this, so I am merging it on master because a bunch of my other work depends on it. we can improve on it later on, indeed, my |
convert from print() to logging
the logging level used varies: most is logging.info(), in some place
logging.warning() or logging.error() are used when the condition is
clearly an error or warning. in other cases, we keep using print, but
force writing to sys.stderr, unless we interact with the user.
there were 77 calls to print before this commit, now there are 7, most
of which in the archiver module, which interacts directly with the
user. in one case there, we still use print() only because logging is
not setup properly yet during argument parsing.
it could be argued that commands like info or list should use print
directly, but we have converted them anyways, without ill effects on
the unit tests
unit tests still use print() in some places
this switches all informational output to stderr, which should help
with, if not fix jborg/attic#312 directly
note that this does not add support for syslog or logfiles yet,
although that would be fairly trivial to add now.
i think that verbose should be an integer and that adding more -v flags would increase debugging information available. but this keeps the current behavior, which we can change easily later, if we choose to start using
logging.debug()
calls.