
Implement own BERT score #473

Merged
merged 46 commits into from
Aug 30, 2021

Conversation


@stancld stancld commented Aug 20, 2021

Before submitting

  • Was this discussed/approved via a GitHub issue? (no need for typos and docs improvements)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure to update the docs?
  • Did you write any new necessary tests?
  • Did you verify new and existing tests pass locally with your changes?

What does this PR do?

This PR implements torchmetrics' own BERTScore and aims to remove the dependency on the original bert-score package (see https://github.com/Tiiiger/bert_score). It also aims to make the transformers package an optional dependency.

Fixes #432 resolves #472

Current state: All important features should be implemented. It is a lot of code, though, so I expect some changes will be required after review :)
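For context, the core of BERTScore is greedy cosine-similarity matching between contextual token embeddings: precision matches each candidate token to its most similar reference token, recall does the reverse, and F1 combines the two. A dependency-free sketch over pre-computed embeddings (function names are illustrative, not this PR's API):

```python
import math


def _cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def bert_score_f1(cand_emb, ref_emb):
    """Greedy-matching BERTScore over pre-computed token embeddings.

    cand_emb / ref_emb: lists of token embedding vectors.
    Returns (precision, recall, f1).
    """
    # Precision: each candidate token matched to its most similar reference token.
    precision = sum(max(_cosine(c, r) for r in ref_emb) for c in cand_emb) / len(cand_emb)
    # Recall: each reference token matched to its most similar candidate token.
    recall = sum(max(_cosine(r, c) for c in cand_emb) for r in ref_emb) / len(ref_emb)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

Identical candidate and reference embeddings yield precision, recall, and F1 of 1.0.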

PR review

Anyone in the community is free to review the PR once the tests have passed. Thank you for all the feedback in advance O:)

Did you have fun?

Make sure you had fun coding 🙃

@stancld stancld marked this pull request as draft August 20, 2021 22:05
@codecov

codecov bot commented Aug 20, 2021

Codecov Report

Merging #473 (f23451e) into master (e6ad813) will decrease coverage by 1%.
The diff coverage is 85%.

@@          Coverage Diff           @@
##           master   #473    +/-   ##
======================================
- Coverage      96%    95%    -1%     
======================================
  Files         130    130            
  Lines        4357   4585   +228     
======================================
+ Hits         4174   4365   +191     
- Misses        183    220    +37     

stancld and others added 10 commits August 21, 2021 14:23
* Fix IDF rescaling

* Add new tests for IDF

* Add explicit dtypes when using torch.cat to make this work with older PyTorch versions
* Add a support for the DDP mode and add the corresponding test

* Add support for verbose mode using a tqdm loader

* Add some missing docs

* Update CHANGELOG.md
* Add support for the user's own model

* Add the corresponding example
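The IDF rescaling mentioned in the commits above weights token-level similarities by inverse document frequency, so frequent tokens contribute less. A minimal sketch, assuming the plus-one smoothing used by the original bert-score package (exact details in this PR may differ):

```python
import math
from collections import Counter


def compute_idf(references):
    """Smoothed IDF weights over tokenized reference sentences.

    references: list of token lists.
    Uses log((N + 1) / (df + 1)), where N is the number of reference
    sentences and df is the number of sentences containing the token.
    """
    num_docs = len(references)
    df = Counter()
    for tokens in references:
        df.update(set(tokens))  # document frequency: count each token once per sentence
    return {tok: math.log((num_docs + 1) / (count + 1)) for tok, count in df.items()}
```

A token appearing in every reference gets weight log((N+1)/(N+1)) = 0, so it is effectively ignored in the weighted score.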
@stancld
Contributor Author

stancld commented Aug 27, 2021

@Borda There's an error with DDP on Windows (please see the exception below). I'm not familiar with Windows; is this something Windows-specific?

E       Exception: 
E       
E       -- Process 0 terminated with the following error:
E       Traceback (most recent call last):
E         File "C:\hostedtoolcache\windows\Python\3.8.10\x64\lib\site-packages\torch\multiprocessing\spawn.py", line 19, in _wrap
E           fn(i, *args)
E         File "D:\a\metrics\metrics\tests\text\test_bertscore.py", line 323, in _test_score_ddp_fn
E           _bert_score_ddp(rank, world_size, preds, refs, original_score)
E         File "D:\a\metrics\metrics\tests\text\test_bertscore.py", line 308, in _bert_score_ddp
E           dist.init_process_group("gloo", rank=rank, world_size=world_size)
E       AttributeError: module 'torch.distributed' has no attribute 'init_process_group'

C:\hostedtoolcache\windows\Python\3.8.10\x64\lib\site-packages\torch\multiprocessing\spawn.py:118: Exception

@Borda
Member

Borda commented Aug 27, 2021

There's an error with DDP on Windows (please see the exception below). I'm not familiar with Windows; is this something Windows-specific?

hehe, I do not think DDP works properly on Windows... @awaelchli?
Maybe skip the test on Windows? @PyTorchLightning/core-metrics

@awaelchli
Contributor

torch.distributed has support for Windows since PyTorch 1.8:
https://pytorch.org/docs/stable/distributed.html

@stancld
Contributor Author

stancld commented Aug 27, 2021

@awaelchli @Borda Thanks a lot, I'll add a condition there :)
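Per the commit log, the condition eventually used was `torch.distributed.is_available()`. An alternative platform/version gate for the test could look like the sketch below (the helper names are illustrative, not the PR's code):

```python
import platform


def _parse_version(version):
    """Return (major, minor) from a version string like '1.8.1+cpu'."""
    parts = version.split("+")[0].split(".")
    return int(parts[0]), int(parts[1])


def ddp_test_supported(torch_version, system=None):
    """True when torch.distributed can be expected to work on this platform.

    torch.distributed gained Windows support in PyTorch 1.8, so the DDP test
    should be skipped on Windows with an older torch.
    """
    system = system or platform.system()
    if system == "Windows":
        return _parse_version(torch_version) >= (1, 8)
    return True
```

In a pytest suite this would typically back a `@pytest.mark.skipif(...)` decorator on the DDP test.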

Contributor

@SeanNaren SeanNaren left a comment


Fun metric, thanks for the hard work!

@mergify mergify bot added the ready label Aug 27, 2021
@Borda Borda enabled auto-merge (squash) August 27, 2021 12:47
@mergify mergify bot removed the ready label Aug 27, 2021
@Borda
Member

Borda commented Aug 27, 2021

There is one last particular config/test failing... mind having a look?
Then we are almost merged 🐰

@stancld
Contributor Author

stancld commented Aug 27, 2021

There is one last particular config/test failing... mind having a look?
Then we are almost merged 🐰

Yeah, it seems to be an OOM issue during the DDP test. I'll try a smaller model for testing.

auto-merge was automatically disabled August 27, 2021 15:17

Head branch was pushed to by a user without write access

@stancld
Contributor Author

stancld commented Aug 27, 2021

@Borda looks like a smaller model solves the issue :)

@mergify mergify bot added ready and removed ready labels Aug 27, 2021
@Borda
Member

Borda commented Aug 29, 2021

@stancld it now seems to take way more time for all the latest configurations... so it is failing on timeout

@stancld
Contributor Author

stancld commented Aug 29, 2021

@Borda It seems some processes hang when running multiprocessing on macOS outside the if __name__ == "__main__" clause. It looks like setting join=False (do not perform a blocking join on all processes, per the PyTorch docs) can help solve this issue.
However, one test configuration is still not executed within a time limit :/

@mergify mergify bot added the ready label Aug 29, 2021
@Borda
Member

Borda commented Aug 30, 2021

I see, let's address this issue in another PR :]

@Borda Borda merged commit 7b09381 into Lightning-AI:master Aug 30, 2021
Borda pushed a commit that referenced this pull request Aug 30, 2021
* Start adding own BERTScore implementation
* Prepare the basic backbone for own BERTScore
* Make working BERTScore with all_layers=True + init bert_score in torchmetrics
* Fix cuda device placing
* Use IDF only if asked
* Add data collators
* Remove old parts of code
* Fix IDF rescaling and add new tests + fix type
* Add new tests for IDF
* Add explicit dtypes when using torch.cat to make this work with older PyTorch versions
* Adjust code to work with the DDP plus some changes
* Add a support for the DDP mode and add the corresponding test
* Add support for verbose mode using a tqdm loader
* Fix a bug with tokenizer and add hash_code
* Fix transformers import
* Add support for the user's own model
* Add the corresponding example
* Fix error raised by default tokenizer
* Add support for the rescale with baseline
* Clean some code + add some docstrings
* Run bert-ddp tests only if torch.distributed.is_available()
* Use smaller model, 'albert-base-v2', for testing because of OOM issues
* Set join=False for mp_spawn in the ddp_test

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Jirka Borovec <[email protected]>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
(cherry picked from commit 7b09381)

# Conflicts:
#	torchmetrics/text/bert.py
#	torchmetrics/utilities/imports.py
Labels
enhancement (New feature or request), ready, refactoring (refactoring and code health), topic: Text
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BERTScore not working in DDP
Implement own BERT score
4 participants