forked from huggingface/transformers
sync #20
Merged
Conversation
* feat(wandb): log artifacts
* fix: typo
* feat(wandb): ensure name is allowed
* feat(wandb): log artifact
* feat(wandb): saving logic
* style: improve formatting
* fix: unrelated typo
* feat: use a fake trainer
* fix: simplify
* feat(wandb): log model files as artifact
* style: fix style
* docs(wandb): correct description
* feat: unpack model + allow env truthy values
* feat: TrainerCallback can access tokenizer
* style: fix style
* feat(wandb): log more interesting metadata
* feat: unpack tokenizer
* feat(wandb): metadata with load_best_model_at_end
* feat(wandb): more robust metadata
* style(wandb): fix formatting

* Fix longformer
* Apply style
* Remove serving content
* Forgot a condition
* Apply style
* Address Patrick's comments
* Fix dtype
This PR proposes to:
* auto-flush `transformers` logging

When using logging to trace signals from different parts of the code, which could be mixed with print debugging, this helps keep all the logging events synchronized. I don't think this change will introduce any performance impact.

If it helps someone, here is the code I used to sync `transformers` logging with various other debug prints. I was porting bart to MP and needed to verify that the device switching happens correctly, so I added a bunch of logger.info calls inside `modeling_bart.py` and also had some other helpers `print` debug messages which weren't logger based:

```
# auto flush std streams
from sys import stdout, stderr

def stdout_write_flush(args, w=stdout.write):
    w(args); stdout.flush()

def stderr_write_flush(args, w=stderr.write):
    w(args); stderr.flush()

stdout.write = stdout_write_flush
stderr.write = stderr_write_flush

from transformers import BartTokenizer, BartForConditionalGeneration, BartConfig

import logging
import transformers.utils.logging
import transformers.models.bart.modeling_bart

# I wanted a shorter, simpler format
handlers = transformers.utils.logging._get_library_root_logger().handlers
for handler in handlers:
    formatter = logging.Formatter("[%(funcName)s] %(message)s")
    handler.setFormatter(formatter)

transformers.models.bart.modeling_bart.logger.setLevel(transformers.logging.INFO)
```

@LysandreJik, @sgugger, @patrickvonplaten
…9347)
* --model_parallel hasn't been implemented for most models
* make the help clear as well
* implement is_parallelizable; use it
* oops
* remove property

* Fix Funnel
* Apply Patrick's comment
* Remove comment
* Fix dummy value
* Apply style

* Use extlinks to point hyperlink with the version of code
* Point to version on release and master until then
* Apply style
* Correct links
* Add missing backtick
* Simple missing backtick after all.

Co-authored-by: Raghavendra Sugeeth P S <[email protected]>
Co-authored-by: Lysandre <[email protected]>

* create model
* add integration
* save current state
* make integration tests pass
* add one more test
* add explanation to tests
* remove from bart
* add padding
* remove unnecessary test
* make all tests pass
* re-add cookie cutter tests
* finish PyTorch
* fix attention test
* Update tests/test_modeling_common.py
* revert change
* remove unused file
* add string to doc
* save intermediate
* make tf integration tests pass
* finish tf
* fix doc
* fix docs again
* add led to doctree
* add to auto tokenizer
* added tips for led
* make style
* apply jplus statements
* correct tf longformer
* apply lysandres suggestions
* apply sylvains suggestions
* Apply suggestions from code review
…of a regression task (#9411)
* first try
* remove old template
* finish bart
* finish mbart
* delete unnecessary line
* init pegasus
* save intermediate
* correct pegasus
* finish pegasus
* remove cookie cutter leftover
* add marian
* finish blenderbot
* replace in file
* correctly split blenderbot
* delete "old" folder
* correct "add statement"
* adapt config for tf comp
* correct configs for tf
* remove ipdb
* fix more stuff
* fix mbart
* push pegasus fix
* fix mbart
* more fixes
* fix research projects code
* finish docs for bart, mbart, and marian
* delete unnecessary file
* correct attn typo
* correct configs
* remove pegasus for seq class
* correct peg docs
* correct peg docs
* finish configs
* further improve docs
* add copied from statements to mbart
* fix copied from in mbart
* add copy statements to marian
* add copied from to marian
* add pegasus copied from
* finish pegasus
* finish copied from
* Apply suggestions from code review
* make style
* backward comp blenderbot
* apply lysandres and sylvains suggestions
* apply suggestions
* push last fixes
* fix docs
* fix tok tests
* fix imports code style
* fix doc

* outline sharded dpp doc
* fix link
* add example
* Apply suggestions from code review
* narrow the command and remove non-essentials

Co-authored-by: Sylvain Gugger <[email protected]>

* Splitting pipelines into its own module.
* Moving everything into base.py
* Moving FeatureExtractionPipeline into its own file.
* TextGenerationPipeline.
* TextClassificationPipeline
* ZeroShot + get_framework import.
* FillMaskPipeline
* NerPipeline + TokenClassificationPipeline
* QuestionAnsweringPipeline
* TableQuestionAnsweringPipeline
* ConversationalPipeline
* Text2TextGenerationPipeline, TranslationPipeline, SummarizationPipeline
* Typo import fix.
* Relative imports.
Co-authored-by: Lysandre Debut <[email protected]>
* Allow example to use a revision and work with private models
* Copy to other examples and template
* Styling

* model wrapped + model_unwrap
* cleanup
* Apply suggestions from code review
* style
* deprecation warning
* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <[email protected]>
Co-authored-by: Sylvain Gugger <[email protected]>

* Add missing lines before a new list.
* Update doc styler and restyle some files.
* Fix docstrings of LED and Longformer

* first commit
* changed XLMTokenizer to HerbertTokenizer in code example

* first commit
* change phobert to phoBERT as per author in overview
* v3 and v4 both run on the same code, hence there is no need to differentiate them

Co-authored-by: Sylvain Gugger <[email protected]>

* Store transformers version info when saving the model
* Store transformers version info when saving the model
* fix format
* fix format
* fix format
* Update src/transformers/configuration_utils.py
* Update configuration_utils.py

Co-authored-by: Lysandre Debut <[email protected]>
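The version-stamping change above can be sketched roughly as follows. This is a minimal illustration, not the actual `configuration_utils.py` code: the helper names and the plain-dict config are assumptions, though the stored key `transformers_version` matches the feature described in the commits.

```python
import json

TRANSFORMERS_VERSION = "4.0.0"  # stand-in for transformers.__version__

def save_config(config: dict, path: str) -> None:
    """Serialize a config, stamping the library version that wrote it."""
    config = dict(config)  # copy so we don't mutate the caller's dict
    config["transformers_version"] = TRANSFORMERS_VERSION
    with open(path, "w") as f:
        json.dump(config, f, indent=2)

def load_config(path: str) -> dict:
    with open(path) as f:
        return json.load(f)
```

A config saved this way records which library version produced it, which helps debug compatibility issues when a model is reloaded later.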
…in GenerationMixin (#9150)
* Define new output dataclasses for greedy generation
* Add output_[...] flags in greedy generation methods
  Added output_attentions, output_hidden_states, output_scores flags in generate and greedy_search methods in GenerationMixin.
* [WIP] Implement logic and tests for output flags in generation
* Update GreedySearchOutput classes & docstring
* Implement greedy search output accumulation logic
  Update greedy_search unittests. Fix generate method return value docstring. Properly init flags with the default config.
* Update configuration to add output_scores flag
* Fix test_generation_utils
  Sort imports and fix isinstance tests for GreedySearchOutputs
* Fix typo in generation_utils
* Add return_dict_in_generate for backwards compatibility
* Add return_dict_in_generate flag in config
* Fix typo in configuration
* Fix handling of attentions and hidden_states flags
* Make style & quality
* first attempt attentions
* some corrections
* improve tests
* special models require special test
* disable xlm test for now
* clean tests
* fix for tf
* isort
* Add output dataclasses for other generation methods
* Add logic to return dict in sample generation
* Complete test for sample generation
  - Pass output_attentions and output_hidden_states flags to encoder in encoder-decoder models
  - Fix import statements order in test_generation_utils file
* Add logic to return dict in sample generation
  - Refactor tests to avoid using self.assertTrue, which provides scarce information when the test fails
  - Add tests for the three beam_search methods: vanilla, sample and grouped
* Style doc
* Fix copy-paste error in generation tests
* Rename logits to scores and refactor
* Refactor group_beam_search for consistency
* make style
* add sequences_scores
* fix all tests
* add docs
* fix beam search finalize test
* correct docstring
* clean some files
* Made suggested changes to the documentation
* Style doc
* Style doc using the Python util
* Update src/transformers/generation_utils.py
* fix empty lines
* fix all tests

Co-authored-by: Patrick von Platen <[email protected]>
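The core pattern behind `return_dict_in_generate` / `output_scores` can be sketched in a few lines of self-contained Python. This is only an illustration of the accumulation idea, not the real `GenerationMixin` code; the class and function names here are invented for the sketch.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class GreedyOutput:
    # mirrors the PR's idea: final token ids plus optional per-step scores
    sequences: List[int]
    scores: Optional[List[List[float]]] = None

def greedy_search(step_logits: List[List[float]],
                  output_scores: bool = False,
                  return_dict: bool = False):
    """Pick the argmax token at each step; optionally keep the raw scores."""
    tokens = []
    scores = [] if output_scores else None
    for logits in step_logits:
        tokens.append(max(range(len(logits)), key=logits.__getitem__))
        if output_scores:
            scores.append(logits)
    if return_dict:
        return GreedyOutput(sequences=tokens, scores=scores)
    return tokens  # backwards-compatible plain return value
```

Defaulting `return_dict` to off is what keeps existing callers working, as the "backwards compatibility" commits above describe.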
* Don't import libs to check they are available
* Don't import integrations at init
* Add importlib_metadata to deps
* Remove old vars references
* Avoid syntax error
* Adapt testing utils
* Try to appease torchhub
* Add dependency
* Remove more private variables
* Fix typo
* Another typo
* Refine the tf availability test

* fix generation models
* fix led
* fix docs
* add is_decoder
* fix last docstrings
* make style
* fix t5 cross attentions
* correct t5

* Clarify definition of seed argument in Trainer
* Update src/transformers/training_args.py
* Update src/transformers/training_args_tf.py
* Fix style
* Update src/transformers/training_args.py

Co-authored-by: Sylvain Gugger <[email protected]>
* TFBart labels consider both pad token and -100
* make style
* fix for all other models

Co-authored-by: kykim <kykim>
Co-authored-by: patrickvonplaten <[email protected]>
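The fix above treats a position as "ignored" in the loss if its label is either the pad token id or the `-100` ignore index. A minimal pure-Python sketch of that masking (shapes and helper names are illustrative, not the actual TF implementation):

```python
IGNORE_INDEX = -100  # the conventional ignore label

def active_loss_mask(labels, pad_token_id):
    """True where a position should contribute to the loss:
    neither the pad token nor the -100 ignore index."""
    return [(tok != pad_token_id) and (tok != IGNORE_INDEX) for tok in labels]

def masked_mean_loss(per_token_loss, labels, pad_token_id):
    """Average the loss over active positions only."""
    mask = active_loss_mask(labels, pad_token_id)
    kept = [loss for loss, keep in zip(per_token_loss, mask) if keep]
    return sum(kept) / len(kept) if kept else 0.0
```

Checking only one of the two sentinels (the bug being fixed) would let padded positions leak into the average and skew the reported loss.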
* Add {decoder_,}head_mask to fsmt_modeling.py
* Enable test_headmasking and some changes to docs
* Remove test_head_masking flag from fsmt test file
  Remove test_head_masking flag from test_modeling_fsmt.py since test_head_masking is set to True by default (thus it is redundant to store).
* Merge master and remove test_head_masking = True
* Rebase necessary due to an update of jaxlib
* Remove test_head_masking=True in tests/test_modeling_fsmt.py as it is redundant.

* [t5 doc] typos
  a few runaway backticks @sgugger
* style
* [trainer] put fp16 args together
  this PR proposes a purely cosmetic change that puts all the fp16 args together, so they are easier to manage/read @sgugger
* style
* [wandb] make WANDB_DISABLED disable wandb with any value
  This PR solves part of #9623. It tries to actually do what #9699 requested/discussed: any value of `WANDB_DISABLED` should disable wandb. The current behavior is that the value has to be one of `ENV_VARS_TRUE_VALUES = {"1", "ON", "YES"}`. I have been using `WANDB_DISABLED=true` everywhere in scripts, as it was originally advertised. I have no idea why this was changed to a subset of possible values, and it's not documented anywhere. @sgugger
* WANDB_DISABLED=true to disable; make tf trainer consistent
* style
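The `WANDB_DISABLED` behavior change can be illustrated with a small sketch. The allow-list matches the `ENV_VARS_TRUE_VALUES` quoted above; the function names are invented for illustration and are not the actual trainer code.

```python
import os

ENV_VARS_TRUE_VALUES = {"1", "ON", "YES"}  # the old allow-list from the text

def wandb_disabled_old(env=os.environ):
    """Old behavior: only a specific subset of values disables wandb."""
    return env.get("WANDB_DISABLED", "").upper() in ENV_VARS_TRUE_VALUES

def wandb_disabled_new(env=os.environ):
    """After the fix: any non-empty value disables wandb."""
    return bool(env.get("WANDB_DISABLED", ""))
```

This makes the bug concrete: `WANDB_DISABLED=true` (the documented usage) was silently ignored by the old check because `"TRUE"` was not in the allow-list.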
* MOD: fit chinese wwm to new datasets
* MOD: move wwm to new folder
* MOD: format code
* Styling
* MOD: add param and recover trainer

Co-authored-by: Sylvain Gugger <[email protected]>

* Remove subclass for sortish sampler
* Use old Seq2SeqTrainer in script
* Styling
This affects Adafactor with relative_step=False and scale_parameter=True. Updating group["lr"] makes the result of ._get_lr() depend on the previous call, i.e., on the scale of other parameters. This isn't supposed to happen.
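The fix amounts to making the learning-rate computation a pure function of the group config and the current parameter's state, never writing back into `group`. A rough self-contained sketch of that shape (field names like `eps2` and `rms` are simplified stand-ins, not the actual optimizer code):

```python
import math

def get_lr(group, param_state):
    """Compute the effective step size without mutating `group`,
    so repeated calls don't depend on previously seen parameters."""
    if group["relative_step"]:
        min_step = 1e-6 * param_state["step"] if group["warmup_init"] else 1e-2
        rel_step = min(min_step, 1.0 / math.sqrt(param_state["step"]))
    else:
        rel_step = group["lr"]  # read, never write, the stored lr
    scale = max(group["eps2"], param_state["rms"]) if group["scale_parameter"] else 1.0
    return scale * rel_step
```

The buggy version stored the scaled value back into `group["lr"]`, so the lr computed for one parameter depended on the RMS of whichever parameter was processed before it.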
* add new model logic
* fix docs
* change structure
* improve add_new_model
* push new changes
* up
* up
* correct spelling
* improve docstring
* correct line length
* update readme
* correct links
* correct typos
* only add rst file for now
* Apply suggestions from code review 1
* Apply suggestions from code review
* Apply suggestions from code review
* Apply suggestions from code review
* Apply suggestions from code review
* finish adding all suggestions
* make style
* apply Niels feedback
* Apply suggestions from code review
* apply sylvains suggestions

Co-authored-by: Stas Bekman <[email protected]>
Co-authored-by: Bram Vanroy <[email protected]>
Co-authored-by: Stefan Schweter <[email protected]>
Co-authored-by: Pierric Cistac <[email protected]>
Co-authored-by: Sylvain Gugger <[email protected]>

* fix conversion script
* typo
* import nn

* Change documentation to correctly specify loss tensor size
* Change documentation to correct input format for labels
* Corrected output size of loss tensor for sequence classifier, multiple choice model and question answering

* change tokenizer requirement
* split line
* Correct typo from list to str
* improve style
* make other function pretty as well
* add comment
* correct typo
* add new test
* pass tests for tok without padding token
* Apply suggestions from code review

* ALBERT Tokenizer integration test
* Batching
* Style

* Initial work
* Fix doc styler and other models

* add raw scaffold
* implement feat extract layers
* make style
* remove +
* correctly convert weights
* make feat extractor work
* make feature extraction proj work
* run forward pass
* finish forward pass
* Successful decoding example
* remove unused files
* more changes
* add wav2vec tokenizer
* add new structure
* fix run forward
* add other layer norm architecture
* finish 2nd structure
* add model tests
* finish tests for tok and model
* clean-up
* make style
* finish docstring for model and config
* make style
* correct docstring
* correct tests
* change checkpoints to fairseq
* fix examples
* finish wav2vec2
* make style
* apply sylvains suggestions
* apply lysandres suggestions
* change print to log.info
* re-add assert statement
* add input_values as required input name
* finish wav2vec2 tokenizer
* Update tests/test_tokenization_wav2vec2.py
* apply sylvains suggestions

Co-authored-by: Lysandre Debut <[email protected]>

* Add {decoder_,}head_mask to LED
* Fix create_custom_forward signature in encoder
* Add head_mask to longformer
* Add head_mask to longformer to fix dependencies of LED on Longformer.
* Not working yet
* Add missing one input in longformer_modeling.py
* make fix-copies
Looks like a vulnerability, and it's not really used anywhere in the code, so we might as well remove it completely from the deps. https://github.com/huggingface/transformers/security/dependabot/examples/research_projects/lxmert/requirements.txt/bleach/open
* Fix Longformer and LED
* Add a test for graph execution with inputs_embeds
* Apply style

* fix steps_in_epoch variable when using max_steps
* redundant sentence
* Revert "redundant sentence"
  This reverts commit ad5c0e9.
* remove redundant sentence

Co-authored-by: wujindou <[email protected]>
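The `steps_in_epoch` fix is about bookkeeping when `max_steps` caps training: the final epoch may be partial, so the per-epoch step count must not exceed the remaining budget. A hedged, self-contained sketch of that logic (the signature and `-1` sentinel are illustrative, not the actual `Trainer` code):

```python
def steps_in_epoch(num_batches, max_steps, steps_done, grad_accum=1):
    """Number of update steps the current epoch will run.

    When max_steps is set (> 0), the last epoch stops early once the
    global step budget is exhausted; otherwise every epoch runs the
    full dataloader (divided by gradient accumulation).
    """
    per_epoch = num_batches // grad_accum
    if max_steps > 0:
        return min(per_epoch, max_steps - steps_done)
    return per_epoch
```

Without the `min(...)`, progress reporting and learning-rate scheduling in the final epoch assume a full epoch's worth of steps that will never run.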
…odel (#9976)
* TF Albert integration test
* TF Albert integration test added

* TF DistilBERT integration test
* Update test_modeling_tf_distilbert.py