[bugfix] Accumulated_gradient and TensoBoard #4738

Merged
merged 29 commits into from
Nov 25, 2020

Conversation

tchaton
Contributor

@tchaton tchaton commented Nov 18, 2020

What does this PR do?

This PR improves how logged metrics are displayed in TensorBoard when training with accumulate_grad_batches > 1.
It originally also introduced a log_epoch_metrics_on_step parameter on the Trainer.

Update: the log_epoch_metrics_on_step idea was dropped.

Fixes #4304
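
For readers unfamiliar with the issue, here is a minimal sketch of why gradient accumulation complicates step-indexed logging. It assumes the logger's x-axis counts optimizer steps rather than batches, so several batches share one step index. The function names `optimizer_step_index` and `group_by_step` are hypothetical and are not part of Lightning, and averaging is only one possible way to reduce the values. This is not the implementation in this PR.

```python
from collections import defaultdict


def optimizer_step_index(batch_idx: int, accumulate_grad_batches: int) -> int:
    """Index of the optimizer step that a given batch contributes to (assumed convention)."""
    return batch_idx // accumulate_grad_batches


def group_by_step(losses, accumulate_grad_batches):
    """Average the per-batch values that fall into the same optimizer step."""
    buckets = defaultdict(list)
    for batch_idx, loss in enumerate(losses):
        step = optimizer_step_index(batch_idx, accumulate_grad_batches)
        buckets[step].append(loss)
    # One value per optimizer step, so each x-axis position gets a single point.
    return {step: sum(vals) / len(vals) for step, vals in buckets.items()}


# Hypothetical per-batch losses: six batches, two batches per optimizer step.
losses = [4.0, 2.0, 3.0, 1.0, 5.0, 3.0]
per_step = group_by_step(losses, accumulate_grad_batches=2)
print(per_step)
```

With two batches per optimizer step, batches 0 and 1 both map to step 0. Writing each batch value separately under that step would put several points at the same x-position, which is the kind of display problem described above.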

log_epoch_metrics_on_step = True

Screenshot 2020-11-18 at 11 38 55

Screenshot 2020-11-18 at 11 39 03

log_epoch_metrics_on_step = False

Screenshot 2020-11-18 at 11 57 42

Screenshot 2020-11-18 at 11 57 58

Before submitting

  • Was this discussed/approved via a GitHub issue? (not needed for typo and docs improvements)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together? Otherwise, we ask you to create a separate PR for every change.
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?
  • Did you verify new and existing tests pass locally with your changes?
  • If you made a notable change (that affects users), did you update the CHANGELOG?

PR review

Anyone in the community is free to review the PR once the tests have passed.
Before you start reviewing, make sure you have read the Review guidelines. In short, see the following bullet list:

  • Is this pull request ready for review? (if not, please submit in draft mode)
  • Check that all items from Before submitting are resolved
  • Make sure the title is self explanatory and the description concisely explains the PR
  • Add labels and milestones (and optionally projects) to the PR so it can be classified; bugfixes should be included in bug-fix release milestones (m.f.X) and features should be included in (m.X.b) releases.

Did you have fun?

Make sure you had fun coding 🙃

@pep8speaks

pep8speaks commented Nov 18, 2020

Hello @tchaton! Thanks for updating this PR.

Line 193:121: E501 line too long (122 > 120 characters)
Line 194:121: E501 line too long (134 > 120 characters)

Comment last updated at 2020-11-25 12:01:40 UTC

@tchaton tchaton self-assigned this Nov 18, 2020
@tchaton tchaton added bug Something isn't working logging Related to the `LoggerConnector` and `log()` labels Nov 18, 2020
@tchaton tchaton added this to the 1.1 milestone Nov 18, 2020
@codecov

codecov bot commented Nov 18, 2020

Codecov Report

Merging #4738 (f5cb188) into master (d24a267) will increase coverage by 0%.
The diff coverage is 100%.

@@          Coverage Diff           @@
##           master   #4738   +/-   ##
======================================
  Coverage      93%     93%           
======================================
  Files         118     118           
  Lines        9031    9033    +2     
======================================
+ Hits         8403    8405    +2     
  Misses        628     628           

@tchaton
Contributor Author

tchaton commented Nov 19, 2020

> @tchaton can we split this to 2 PRs?

Hey @edenlightning, I removed the parameter as you suggested. This PR now contains only the fix for TensorBoard logging when accumulate_grad_batches > 1.

@tchaton tchaton modified the milestones: 1.1, 1.0.x Nov 20, 2020
@tchaton tchaton changed the title Accumulated_gradient and tensorboard [bugfix] Accumulated_gradient and TensoBoard Nov 20, 2020
Member

@Borda Borda left a comment

sure about the API, and pls update docs

Member

@SkafteNicki SkafteNicki left a comment

LGTM

@tchaton
Contributor Author

tchaton commented Nov 25, 2020

Hey @williamFalcon, can you review this one?

@tchaton tchaton added the ready PRs ready to be merged label Nov 25, 2020
@tchaton tchaton merged commit 204a0a2 into master Nov 25, 2020
@Borda Borda deleted the bugfix/4304_tensorboard_accumulated_grad branch November 29, 2020 22:11
Labels
bug Something isn't working design Includes a design discussion logging Related to the `LoggerConnector` and `log()` ready PRs ready to be merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

TensorBoardLogger not working as expected with accumulate_grad_batches>1
8 participants