convert from print() to logging #233

anarcat · 2015-10-01T17:47:09Z

the logging level used varies: most is logging.info(), in some place
logging.warning() or logging.error() are used when the condition is
clearly an error or warning. in other cases, we keep using print, but
force writing to sys.stderr, unless we interact with the user.

there were 77 calls to print before this commit, now there are 7, most
of which in the archiver module, which interacts directly with the
user. in one case there, we still use print() only because logging is
not setup properly yet during argument parsing.

it could be argued that commands like info or list should use print
directly, but we have converted them anyways, without ill effects on
the unit tests

unit tests still use print() in some places

this switches all informational output to stderr, which should help
with, if not fix jborg/attic#312 directly

note that this does not add support for syslog or logfiles yet,
although that would be fairly trivial to add now.

i think that verbose should be an integer and that adding more -v flags would increase debugging information available. but this keeps the current behavior, which we can change easily later, if we choose to start using logging.debug() calls.

this will help when we want to refactor output functions this function should definitely be replaced by a __repr__() or __str__() however

the logging level varies: most is logging.info(), in some place logging.warning() or logging.error() are used when the condition is clearly an error or warning. in other cases, we keep using print, but force writing to sys.stderr, unless we interact with the user. there were 77 calls to print before this commit, now there are 7, most of which in the archiver module, which interacts directly with the user. in one case there, we still use print() only because logging is not setup properly yet during argument parsing. it could be argued that commands like info or list should use print directly, but we have converted them anyways, without ill effects on the unit tests unit tests still use print() in some places this switches all informational output to stderr, which should help with, if not fix jborg/attic#312 directly

RonnyPfannschmidt · 2015-10-01T17:49:19Z

borg/archive.py

@@ -650,7 +652,7 @@ def init_chunks(self):
    def report_progress(self, msg, error=False):
        if error:
            self.error_found = True
-        print(msg, file=sys.stderr if error else sys.stdout)


this should use a logger and use log.error/log.warn

isn't that exactly what that patch does?

codecov-io · 2015-10-01T18:00:05Z

Current coverage is `83.36%`

Merging #233 into master will increase coverage by +0.23% as of dbc183f

@@            master    #233   diff @@
======================================
  Files           31      33     +2
  Stmts         6748    6829    +81
  Branches         0       0       
  Methods          0       0       
======================================
+ Hit           5610    5693    +83
  Partial          0       0       
+ Missed        1138    1136     -2

Review entire Coverage Diff as of dbc183f

Uncovered Suggestions

Powered by Codecov. Updated on successful CI builds.

RonnyPfannschmidt · 2015-10-01T18:00:13Z

borg/helpers.py

@@ -143,13 +144,14 @@ def update(self, size, csize, unique):

    def print_(self, label, cache):
        total_size, total_csize, unique_size, unique_csize, total_unique_chunks, total_chunks = cache.chunks.summarize()


it seems like at that place there should be newstyle string formating and a namedtuple return value

i didn't change that code, i'm trying to keep from doing unrelated changes here.

Yes, we should do that later.

RonnyPfannschmidt · 2015-10-01T18:01:35Z

blindly using the root logger is problematic, instead each module should have own loggers to give control over the granularity

anarcat · 2015-10-01T18:07:17Z

i am still unclear what to do about --stats, it could be that the argument is redundant and we should just show stats if --verbose. right now, i have kept them shown...

also, the test coverage suffers a bit from the patch, not sure how to fix that...

anarcat · 2015-10-01T18:07:43Z

blindly using the root logger is problematic, instead each module should have own loggers to give control over the granularity

well, it's a first pass... of course each "module" could use its own logger, but really, do we have modules?

we use logging.warning in info and list, but print() more usage bits. we also now support logging.debug() and by default are more silent

ThomasWaldmann · 2015-10-01T18:20:54Z

@anarcat modules like python modules. it is useful if one want rather lots of debug logging, but not from every module, but just some specific one which is being currently debugged.

anarcat · 2015-10-01T18:30:35Z

true. i'm not sure how to architecture that, suggestions? (by "that" i mean per module loggers)

ThomasWaldmann · 2015-10-02T08:58:46Z

About abuse of log levels:

steering verbosity ONLY via log level (in the sense of ERROR, WARNING, INFO, DEBUG) seems wrong, especially if it needs hacking the log level in a unnatural way. logging.warning should be only used if it is actually a warning that you're logging, not to make something visible that might not be visible because you reduced output by setting the minimum level to warning.

So I propose the following:

Use logging.error ONLY for errors, logging.warning ONLY for warning, logging.debug ONLY for debugging.

Informative output should be logged via logging.info.

This change should be implemented within this pull request.

Additionally, we could later introduce one of the alternatives:

a) There could be an additional "verbosity level" that controls how much informative output one gets (which could be controlled via verbose being an integer created either from -v0..9 or -v, -vv, -vvv - whatever is easiest with the argument parser).

b) There could be additional "output options" that control which kinds of informative outputs are emitted.
E.g. whether to show stats or not, whether to show this or that or not.

anarcat · 2015-10-02T13:48:45Z

so one of the things i'm trying to do here is to silence borg by default. that means that logging.info() is not shown by default. the "rule of silence". that means that borg info doesn't work by default. that doesn't make sense.

so i agree with your proposal, but i would extend that things that need to talk to the user by default (help, info, list, prompts) should use print(file=sys.stderr) instead of logging. this is in accordance with the logging documentation as well.

as for levels, i think that having three verbosity (silent, verbose, really verbose) makes sense and is a common occurance in software programs. i would like to stick with that. that would map to WARNING, INFO and DEBUG levels.

as for output options, maybe using logging modules could help discriminating what output we see? don't know.

ThomasWaldmann · 2015-10-02T14:06:32Z

As borg info is just the user interactively requesting to see some information on the terminal, I agree it should be print to sys.stderr.

I had the same kind of decision problems print vs. logging at quite some places.

About verbosity levels: defining what appears on what level needs a lot of thinking and agreement (and not every user might agree with the set of things we show to appear on some level). Thus, defining named features, we can still define some predefined feature sets that will be the official sets for silent, terse, verbose. One could also give a custom set via cmdline then to get whatever one wants.

anarcat · 2015-10-02T14:22:03Z

i think we're in violent agreement. :)

anarcat · 2015-10-02T14:35:01Z

so i looked again at the logging modules stuff, and it seems fairly trivial to log to different loggers based on the module name. i don't know what you guys had in mind, but it seems this could be sufficient, from:

import logging

logging.info('hello world')

we would instead do:

import logging
logger = logging.getLogger(__name__)

logger.info('hello world')

it's a slightly longer import, but then we can have different logging levels for different parts easily with an INI-style configuration file, or some Python dict object, that can be generated from anything we want. I would advise against creating a complex hierarchy of commandline options for this: there is a neat config file format that is well documented, let's use that.

also: the current format is basically "%(message)s" so it doesn't show the log level or the module name. I would add the module name, but i would leave the log level out: it's really annoying to have stuff like:

INFO: hello world!
INFO: hello again!

principle of silence again...

ThomasWaldmann · 2015-10-02T14:59:54Z

@anarcat per module logger is good to have, but not a replacement for logging feature sets.

I am using per module logging in moin since long, but it is not flexible enough.

And at least for borgweb, there must be the log level in the output. And I think it is generally useful, some users misinterpreted attic/borgs output in the past because it was not clear it was just a warning, not a (fatal) error.

that way we have one logger per module, and we can pick and choose which module we want verbose, for example

the heuristics i used are the following: 1. if we are prompting the use, use print on stderr (input() may produce some stuff on stdout, but it's outside the scope of this patch). we do not want those prompts to end up on the standard output in case we are piping stuff around 2. if the command is primarily producing output for the user on the console (`list`, `info`, `help`), we simply print on the default file descriptor. 3. everywhere else, we use the logging module with varying levels of verbosity, as appropriate.

we should find other ways of forcing that to be shown, it seems

anarcat · 2015-10-02T15:20:04Z

i've reverted parts of the print() to logging.warning() conversion. for stats, the heuristic i use is that when the command is primarly aimed at producing output interactively (info, list), we use print, on stdout. when the output is auxiliary (e.g. prune, create, etc), i use log.info() because we may want those to end up in logs instead of just on the console.

in prompts, we use print on stderr, just because i assume that we may be piping stuff and i don't want to clutter the output (a problem we don't have with list and info).

i have also pushed per-module loggers, which will give use crude, but at least some control on the output, per module.

it doesn't resolve the case of --stats. i am not sure how to deal with this: true, per modules controler are not enough for this, but what's the alternative?

maybe we are conflating two issues here... the way --stats is output can be treated separately from the logging conversion. in other words, we could keep throwing --stats on stderr and simply not log those at all. or we could check if the logging module will output to the console and avoid dupes...

not sure how to deal with that.

anarcat · 2015-10-02T15:21:19Z

note that the output changes with the new formatter, maybe that's not desired. before:

$ borg init /tmp/borg-test
Initializing repository at "/tmp/borg-test"
Encryption NOT enabled.
Use the "--encryption=repokey|keyfile|passphrase" to enable encryption.
Synchronizing chunks cache...
Archives: 0, w/ cached Idx: 0, w/ outdated Idx: 0, w/o cached Idx: 0.
Done.

after:

$ borg init -v /tmp/borg-test3
borg.archiver: Initializing repository at "/tmp/borg-test3"
borg.key: Encryption NOT enabled.
Use the "--encryption=repokey|keyfile|passphrase" to enable encryption.
borg.cache: Synchronizing chunks cache...
borg.cache: Archives: 0, w/ cached Idx: 0, w/ outdated Idx: 0, w/o cached Idx: 0.
borg.cache: Done.

yuck. i think i'll just revert this. :)

too much clutter

ThomasWaldmann · 2015-10-02T15:51:17Z

maybe printing the module name is not needed.

anarcat · 2015-10-02T17:51:44Z

for --stats, i think that we could have topical flags that force console output. things like:

--progress - to output a progress bar (eventually?)
--changed - to output the file listing what -v currently outputs
--unchanged - to also include unchanged files :p (see do not show unchanged files #236 for that specific bikeshed)
--stats - to show stats in the end

we could have a standard way of handling that behavior. for example:

logging.info(stats)
if args.stats and not magically_figure_out_if_logging_wrote_to_stderr():
    print(stats, file=sys.stderr)

that is: print to the console if requested, but always log so that logfiles have the relevant info, and avoid duplicates. obviously, the above method is incomplete, as i have no idea how to avoid those duplicates.

one way to accomplish this would be through logging filters that allow censoring certain content. if we send only objects to the logger, the filter will see objects and can decide whether we need to censor it or not, depending on args or whatever. that would mean that each object shoved in there would need to have a proper __str__() representation, but that is something that seems necessary anyways.

then those --stats, --changed, and so on flags only become filters. this could look something like:

def map_args_to_cls(args):
    cls = []
    if args.stats:
        cls += [ThatsStatsClassName]
    # [...] maybe have some way of hooking in here?
    return cls

def filtertemplate(args):
    cls = map_args_to_cls(args)
    def filter(record):
        return type(record.msg) is in cls:
    return filter

def setup_logging(args, ...):
    # [...]
    # python 3.2+ specific, before it expects a class, not a single function
    stderr_stream_handler.addFilter(filtertemplate(args))
    # [...]

what do you think? notice how we filter only stderr, which allows us to log everything anyways.

this greatly simplifies the display of those objects, as the __format__() parameter allows for arbitrary display of the internal fields of both objects this will allow us to display those summaries without having to pass a label to the string representation. we can also print the objects directly without formatting at all.

ThomasWaldmann · 2015-10-08T20:40:20Z

borg/logger.py

+    except AttributeError:
+        # somehow we failed to find our module
+        # return the logger module name by default
+        return __name__


that fn is unusual, why do you need it?

the idea was to standardise logger creation somewhere. the function is necessary to create per-modules loggers, otherwise:

def create_logger(): return logger.getLogger(__name__)

will always return a logger with the name 'logger' (because that's where the function is defined).

maybe overengineered.

ThomasWaldmann · 2015-10-08T20:44:39Z

borg/testsuite/logger.py

+    assert io_logger.getvalue() == "borg.testsuite.logger: hello world 2\n"
+
+def test_parent_module():
+    assert find_parent_module() == __name__


i think this module is not necessary. we can expect logging stdlib module has own tests / has been tested.
we only need to test our stuff, not whether stdlib works.

this was necessary to test that the setup wasn't running twice and therefore adding mutliple stream handlers. i had seen such cases during development and was trying to catch it.

we also test that the module detection code is actually working - i had cases where find_parent_module() would work yet logging.info() would use a global logger instead of the module-specific one (i was naively doing a logger = logging.getLogger('') in logger.py)...

our default verbosity shows only warnings, we'd have to tweak tests to be verbose for this to work This reverts commit 27be46a.

this was a remnant of when i was writing the converter/upgrader code, and was destined to be a general progress message in the migration process. i removed a more technical, internal debugging message in exchange

Conflicts: borg/archive.py borg/archiver.py borg/cache.py borg/key.py

ThomasWaldmann · 2015-10-09T17:08:18Z

borg/logger.py

+
+    we use it to avoid errors and provide a more standard API.
+    """
+    return logging.getLogger(name or find_parent_module())


this seems a bit too little to rectify an own function + that exotic support code.

in each module, it could be just:

import logging logger = logging.getLogger(__name__)

well i don't know. that's how i originally did it, of course. then i switched in anarcat@42cc17c

the idea was that, as you mentionned, the logging module can be tricky to setup. we haven't had that case yet, but it could be that in some cases we'd want to make sure the logging is setup before creating loggers. having our own functions could enforce that, for example.

ThomasWaldmann · 2015-10-10T10:11:10Z

i suggest to either merge this to master or create a branch in borg repo so we can collaboratively work there. thanks for your effort!

anarcat · 2015-10-13T03:17:33Z

i have pushed to logging-refactor in the borg repo for now. i'd like to shove more testing into this before merging it myself (this will happen soon!).

ThomasWaldmann · 2015-10-13T08:26:06Z

ok, fine. :)

anarcat · 2015-10-17T03:59:33Z

after further testing, i am satisfied with this, so I am merging it on master because a bunch of my other work depends on it.

we can improve on it later on, indeed, my verbosity branch does just that.

convert from print() to logging

anarcat added 2 commits October 1, 2015 12:39

just call print() once in the odd print_() function

c996fd8

this will help when we want to refactor output functions this function should definitely be replaced by a __repr__() or __str__() however

anarcat mentioned this pull request Oct 1, 2015

use the logging module #74

Closed

RonnyPfannschmidt reviewed Oct 1, 2015
View reviewed changes

convert more print() statements to logging

1b540d9

we use logging.warning in info and list, but print() more usage bits. we also now support logging.debug() and by default are more silent

anarcat added 4 commits October 2, 2015 11:05

use a module-specific logger instead of global one

c9b1131

that way we have one logger per module, and we can pick and choose which module we want verbose, for example

log level tweaking: stats should not be a warning

1819ada

we should find other ways of forcing that to be shown, it seems

remove stray comment

2f78b8a

don't add module name on standard messages

b09cdb4

too much clutter

ThomasWaldmann reviewed Oct 8, 2015
View reviewed changes

anarcat added 2 commits October 8, 2015 16:44

remove status that is best reflected in VCS

b1eafe7

fix typo

8075a13

ThomasWaldmann reviewed Oct 8, 2015
View reviewed changes

anarcat added 5 commits October 8, 2015 16:50

fix all logger pep8 warnings except long lines

a42a5e3

style: wrap multiline strings elegantly

3f68399

fix tests: they expect check to spew output

efa88ef

our default verbosity shows only warnings, we'd have to tweak tests to be verbose for this to work This reverts commit 27be46a.

move progress information out of tests and back in converter

fd41408

this was a remnant of when i was writing the converter/upgrader code, and was destined to be a general progress message in the migration process. i removed a more technical, internal debugging message in exchange

Merge remote-tracking branch 'origin/master' into logging-refactor

bdbdbdd

Conflicts: borg/archive.py borg/archiver.py borg/cache.py borg/key.py

ThomasWaldmann reviewed Oct 9, 2015
View reviewed changes

drop support for Python 2

dbc183f

anarcat self-assigned this Oct 9, 2015

anarcat added a commit that referenced this pull request Oct 17, 2015

Merge pull request #233 from anarcat/logging-refactor

128ff40

convert from print() to logging

anarcat merged commit 128ff40 into borgbackup:master Oct 17, 2015

This was referenced Oct 17, 2015

Verbosity fixes #288

Merged

Verbosity #303

Merged

logging / output: implement topical flags set #302

Closed

anarcat referenced this pull request Oct 20, 2015

update CHANGES.rst

344a25c

anarcat mentioned this pull request Oct 21, 2015

better progress indication for borg check #195

Closed

anarcat deleted the logging-refactor branch October 21, 2015 13:34

anarcat mentioned this pull request Nov 17, 2015

implement --log-level #426

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

convert from print() to logging #233

convert from print() to logging #233

anarcat commented Oct 1, 2015

RonnyPfannschmidt Oct 1, 2015

anarcat Oct 1, 2015

codecov-io commented Oct 1, 2015

RonnyPfannschmidt Oct 1, 2015

anarcat Oct 1, 2015

ThomasWaldmann Oct 2, 2015

RonnyPfannschmidt commented Oct 1, 2015

anarcat commented Oct 1, 2015

anarcat commented Oct 1, 2015

ThomasWaldmann commented Oct 1, 2015

anarcat commented Oct 1, 2015

ThomasWaldmann commented Oct 2, 2015

anarcat commented Oct 2, 2015

ThomasWaldmann commented Oct 2, 2015

anarcat commented Oct 2, 2015

anarcat commented Oct 2, 2015

ThomasWaldmann commented Oct 2, 2015

anarcat commented Oct 2, 2015

anarcat commented Oct 2, 2015

ThomasWaldmann commented Oct 2, 2015

anarcat commented Oct 2, 2015

ThomasWaldmann Oct 8, 2015

anarcat Oct 8, 2015

ThomasWaldmann Oct 8, 2015

anarcat Oct 8, 2015

ThomasWaldmann Oct 9, 2015

anarcat Oct 9, 2015

ThomasWaldmann commented Oct 10, 2015

anarcat commented Oct 13, 2015

ThomasWaldmann commented Oct 13, 2015

anarcat commented Oct 17, 2015

		@@ -143,13 +144,14 @@ def update(self, size, csize, unique):

		def print_(self, label, cache):
		total_size, total_csize, unique_size, unique_csize, total_unique_chunks, total_chunks = cache.chunks.summarize()

convert from print() to logging #233

convert from print() to logging #233

Conversation

anarcat commented Oct 1, 2015

Choose a reason for hiding this comment

Choose a reason for hiding this comment

codecov-io commented Oct 1, 2015

Current coverage is 83.36%

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

RonnyPfannschmidt commented Oct 1, 2015

anarcat commented Oct 1, 2015

anarcat commented Oct 1, 2015

ThomasWaldmann commented Oct 1, 2015

anarcat commented Oct 1, 2015

ThomasWaldmann commented Oct 2, 2015

anarcat commented Oct 2, 2015

ThomasWaldmann commented Oct 2, 2015

anarcat commented Oct 2, 2015

anarcat commented Oct 2, 2015

ThomasWaldmann commented Oct 2, 2015

anarcat commented Oct 2, 2015

anarcat commented Oct 2, 2015

ThomasWaldmann commented Oct 2, 2015

anarcat commented Oct 2, 2015

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ThomasWaldmann commented Oct 10, 2015

anarcat commented Oct 13, 2015

ThomasWaldmann commented Oct 13, 2015

anarcat commented Oct 17, 2015

Current coverage is `83.36%`