
Implement own BERT score #473

Merged
merged 46 commits into from
Aug 30, 2021

Conversation


@stancld stancld commented Aug 20, 2021

Before submitting

  • Was this discussed/approved via a GitHub issue? (no need for typos and docs improvements)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure to update the docs?
  • Did you write any new necessary tests?
  • Did you verify new and existing tests pass locally with your changes?

What does this PR do?

This PR implements torchmetrics' own BERTScore and aims to remove the dependency on the original bert-score package (see https://github.com/Tiiiger/bert_score). It also aims to make the transformers package an optional dependency.

Fixes #432 resolves #472

Current state: All important features should be implemented. It is a lot of code, though, so I expect some changes will be required after review :)
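For context, the core of BERTScore is greedy cosine-similarity matching between contextual token embeddings: precision matches each candidate token to its most similar reference token, recall does the reverse, and F1 combines the two. A dependency-free sketch over pre-computed embeddings (function names are illustrative, not this PR's API):

```python
import math


def _cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def bert_score_f1(cand_emb, ref_emb):
    """Greedy-matching BERTScore over pre-computed token embeddings.

    cand_emb / ref_emb: lists of token embedding vectors.
    Returns (precision, recall, f1).
    """
    # Precision: each candidate token matched to its most similar reference token.
    precision = sum(max(_cosine(c, r) for r in ref_emb) for c in cand_emb) / len(cand_emb)
    # Recall: each reference token matched to its most similar candidate token.
    recall = sum(max(_cosine(r, c) for c in cand_emb) for r in ref_emb) / len(ref_emb)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

Identical candidate and reference embeddings yield precision, recall, and F1 of 1.0.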

PR review

Anyone in the community is free to review the PR once the tests have passed. Thank you for all the feedback in advance O:)

Did you have fun?

Make sure you had fun coding 🙃

@stancld stancld marked this pull request as draft August 20, 2021 22:05
@codecov

codecov bot commented Aug 20, 2021

Codecov Report

Merging #473 (f23451e) into master (e6ad813) will decrease coverage by 1%.
The diff coverage is 85%.

@@          Coverage Diff           @@
##           master   #473    +/-   ##
======================================
- Coverage      96%    95%    -1%     
======================================
  Files         130    130            
  Lines        4357   4585   +228     
======================================
+ Hits         4174   4365   +191     
- Misses        183    220    +37     

stancld and others added 10 commits August 21, 2021 14:23
* Fix IDF rescaling

* Add new tests for IDF

* Add explicit dtypes when using torch.cat to make this work with older PyTorch versions
* Add a support for the DDP mode and add the corresponding test

* Add support for verbose mode using a tqdm loader

* Add some missing docs

* Update CHANGELOG.md
* Add support for the user's own model

* Add the corresponding example
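The IDF rescaling mentioned in the commits above weights token-level similarities by inverse document frequency, so frequent tokens contribute less. A minimal sketch, assuming the plus-one smoothing used by the original bert-score package (exact details in this PR may differ):

```python
import math
from collections import Counter


def compute_idf(references):
    """Smoothed IDF weights over tokenized reference sentences.

    references: list of token lists.
    Uses log((N + 1) / (df + 1)), where N is the number of reference
    sentences and df is the number of sentences containing the token.
    """
    num_docs = len(references)
    df = Counter()
    for tokens in references:
        df.update(set(tokens))  # document frequency: count each token once per sentence
    return {tok: math.log((num_docs + 1) / (count + 1)) for tok, count in df.items()}
```

A token appearing in every reference gets weight log((N+1)/(N+1)) = 0, so it is effectively ignored in the weighted score.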
@stancld
Contributor Author

stancld commented Aug 27, 2021

@Borda There's an error with DDP on Windows (please see the exception below). I'm not familiar with Windows; is this something Windows-specific?

E       Exception: 
E       
E       -- Process 0 terminated with the following error:
E       Traceback (most recent call last):
E         File "C:\hostedtoolcache\windows\Python\3.8.10\x64\lib\site-packages\torch\multiprocessing\spawn.py", line 19, in _wrap
E           fn(i, *args)
E         File "D:\a\metrics\metrics\tests\text\test_bertscore.py", line 323, in _test_score_ddp_fn
E           _bert_score_ddp(rank, world_size, preds, refs, original_score)
E         File "D:\a\metrics\metrics\tests\text\test_bertscore.py", line 308, in _bert_score_ddp
E           dist.init_process_group("gloo", rank=rank, world_size=world_size)
E       AttributeError: module 'torch.distributed' has no attribute 'init_process_group'

C:\hostedtoolcache\windows\Python\3.8.10\x64\lib\site-packages\torch\multiprocessing\spawn.py:118: Exception

@Borda
Member

Borda commented Aug 27, 2021

There's an error with DDP on Windows (please see the exception below). I'm not familiar with Windows; is this something Windows-specific?

hehe, I do not think DDP works properly on Windows... @awaelchli?
Maybe skip the test on Windows? @PyTorchLightning/core-metrics

@awaelchli
Contributor

torch.distributed has support for Windows since PyTorch 1.8:
https://pytorch.org/docs/stable/distributed.html

@stancld
Contributor Author

stancld commented Aug 27, 2021

@awaelchli @Borda Thanks a lot, I'll add a condition there :)
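Per the commit log, the condition eventually used was `torch.distributed.is_available()`. An alternative platform/version gate for the test could look like the sketch below (the helper names are illustrative, not the PR's code):

```python
import platform


def _parse_version(version):
    """Return (major, minor) from a version string like '1.8.1+cpu'."""
    parts = version.split("+")[0].split(".")
    return int(parts[0]), int(parts[1])


def ddp_test_supported(torch_version, system=None):
    """True when torch.distributed can be expected to work on this platform.

    torch.distributed gained Windows support in PyTorch 1.8, so the DDP test
    should be skipped on Windows with an older torch.
    """
    system = system or platform.system()
    if system == "Windows":
        return _parse_version(torch_version) >= (1, 8)
    return True
```

In a pytest suite this would typically back a `@pytest.mark.skipif(...)` decorator on the DDP test.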

Contributor

@SeanNaren SeanNaren left a comment


Fun metric, thanks for the hard work!

@mergify mergify bot added the ready label Aug 27, 2021
@Borda Borda enabled auto-merge (squash) August 27, 2021 12:47
@mergify mergify bot removed the ready label Aug 27, 2021
@Borda
Member

Borda commented Aug 27, 2021

There is one last particular config/test failing... mind having a look?
Then we are almost merged 🐰

@stancld
Contributor Author

stancld commented Aug 27, 2021

There is one last particular config/test failing... mind having a look?
Then we are almost merged 🐰

Yeah, it seems to be an OOM issue during the DDP test. I'll try a smaller model for testing.

auto-merge was automatically disabled August 27, 2021 15:17

Head branch was pushed to by a user without write access

@stancld
Contributor Author

stancld commented Aug 27, 2021

@Borda looks like a smaller model solves the issue :)

@mergify mergify bot added ready and removed ready labels Aug 27, 2021
@Borda
Member

Borda commented Aug 29, 2021

@stancld it now seems to take way more time for all the latest configurations... so it is failing on timeout

@stancld
Contributor Author

stancld commented Aug 29, 2021

@Borda It seems some processes hang when running multiprocessing on macOS outside the if __name__ == "__main__" clause. It looks like setting join=False (do not perform a blocking join on all processes, per the PyTorch docs) can help solve this issue.
However, one test configuration is still not executed within a time limit :/

@mergify mergify bot added the ready label Aug 29, 2021
@Borda
Member

Borda commented Aug 30, 2021

I see, let's address this issue in another PR :]

@Borda Borda merged commit 7b09381 into Lightning-AI:master Aug 30, 2021
Borda pushed a commit that referenced this pull request Aug 30, 2021
* Start adding own BERTScore implementation
* Prepare the basic backbone for own BERTScore
* Make working BERTScore with all_layers=True + init bert_score in torchmetrics
* Fix cuda device placing
* Use IDF only if asked
* Add data collators
* Remove old parts of code
* Fix IDF rescaling and add new tests + fix type
* Add new tests for IDF
* Add explicit dtypes when using torch.cat to make this work with older PyTorch versions
* Adjust code to work with the DDP plus some changes
* Add a support for the DDP mode and add the corresponding test
* Add support for verbose mode using a tqdm loader
* Fix a bug with tokenizer and add hash_code
* Fix transformers import
* Add support for the user's own model
* Add the corresponding example
* Fix error raised by default tokenizer
* Add support for the rescale with baseline
* Clean some code + add some docstrings
* Run bert-ddp tests only if torch.distributed.is_available()
* Use smaller model, 'albert-base-v2', for testing because of OOM issues
* Set join=False for mp_spawn in the ddp_test

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Jirka Borovec <[email protected]>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
(cherry picked from commit 7b09381)

# Conflicts:
#	torchmetrics/text/bert.py
#	torchmetrics/utilities/imports.py
Labels
enhancement (New feature or request), ready, refactoring (refactoring and code health), topic: Text
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BERTScore not working in DDP
Implement own BERT score
4 participants