Fix documentation links always pointing to master. #9217

Merged
merged 9 commits into from Jan 5, 2021

Changes from 8 commits
8 changes: 4 additions & 4 deletions docs/source/benchmarks.rst
@@ -15,8 +15,8 @@ Benchmarks

Let's take a look at how 🤗 Transformer models can be benchmarked, along with best practices and the already available benchmarks.

A notebook explaining in more detail how to benchmark 🤗 Transformer models can be found `here
<https://github.com/huggingface/transformers/blob/master/notebooks/05-benchmark.ipynb>`__.
A notebook explaining in more detail how to benchmark 🤗 Transformer models can be found :prefix_link:`here
<notebooks/05-benchmark.ipynb>`.

How to benchmark 🤗 Transformer models
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -353,5 +353,5 @@ The approach is detailed in the `following blogpost
available `here
<https://docs.google.com/spreadsheets/d/1sryqufw2D0XlUH4sq3e9Wnxu5EAQkaohzrJbd5HdQ_w/edit?usp=sharing>`__.

With the new `benchmark` tools, it is easier than ever to share your benchmark results with the community `here
<https://github.com/huggingface/transformers/blob/master/examples/benchmarking/README.md>`__.
With the new `benchmark` tools, it is easier than ever to share your benchmark results with the community
:prefix_link:`here <examples/benchmarking/README.md>`.
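
The ``benchmark`` tools mentioned here can also be driven directly from Python. Below is a minimal sketch of the inference benchmark API; the model name, batch sizes, and sequence lengths are illustrative choices, not recommendations.

.. code-block:: python

    # A minimal sketch of the inference benchmark API; all sizes below
    # are illustrative.
    from transformers import PyTorchBenchmark, PyTorchBenchmarkArguments

    args = PyTorchBenchmarkArguments(
        models=["bert-base-uncased"],
        batch_sizes=[8],
        sequence_lengths=[8, 32, 128],
    )
    benchmark = PyTorchBenchmark(args)
    results = benchmark.run()
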
6 changes: 3 additions & 3 deletions docs/source/bertology.rst
@@ -33,6 +33,6 @@ help people access the inner representations, mainly adapted from the great work
* retrieving head output values and gradients to be able to compute head importance scores and prune heads as explained
in https://arxiv.org/abs/1905.10650.

To help you understand and use these features, we have added a specific example script: `bertology.py
<https://github.com/huggingface/transformers/blob/master/examples/research_projects/bertology/run_bertology.py>`_ while
extract information and prune a model pre-trained on GLUE.
To help you understand and use these features, we have added a specific example script: :prefix_link:`bertology.py
<examples/research_projects/bertology/run_bertology.py>` which extracts information from and prunes a model pre-trained
on GLUE.
6 changes: 5 additions & 1 deletion docs/source/conf.py
@@ -27,7 +27,10 @@
version = u''
# The full version, including alpha/beta/rc tags
release = u'4.1.1'

# Prefix link pointing to master; comment this out during a version release and uncomment the line below
extlinks = {'prefix_link': ('https://github.com/huggingface/transformers/blob/master/%s', '')}
# Prefix link always pointing to the corresponding version; uncomment this during a version release
# extlinks = {'prefix_link': ('https://github.com/huggingface/transformers/blob/v' + release + '/%s', '')}

# -- General configuration ---------------------------------------------------

@@ -40,6 +43,7 @@
# ones.
extensions = [
'sphinx.ext.autodoc',
'sphinx.ext.extlinks',
'sphinx.ext.coverage',
'sphinx.ext.napoleon',
'recommonmark',
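
To make the mechanics of this change concrete, here is a small sketch of the substitution that ``sphinx.ext.extlinks`` performs for the new role; ``expand_prefix_link`` is an illustrative helper of ours, not Sphinx API.

.. code-block:: python

    # Sphinx performs this %s substitution itself when rendering
    # :prefix_link:`text <target>`; expand_prefix_link is only illustrative.
    BASE = "https://github.com/huggingface/transformers/blob/master/%s"

    def expand_prefix_link(target: str) -> str:
        return BASE % target

    print(expand_prefix_link("notebooks/05-benchmark.ipynb"))
    # https://github.com/huggingface/transformers/blob/master/notebooks/05-benchmark.ipynb

During a version release, switching to the commented configuration pins every ``:prefix_link:`` to the versioned tree (e.g. ``blob/v4.1.1/``) instead of ``master``, which is exactly the behavior this PR makes possible.
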
10 changes: 4 additions & 6 deletions docs/source/converting_tensorflow_models.rst
@@ -27,9 +27,8 @@ BERT

You can convert any TensorFlow checkpoint for BERT (in particular `the pre-trained models released by Google
<https://github.com/google-research/bert#pre-trained-models>`_\ ) into a PyTorch save file by using the
`convert_bert_original_tf_checkpoint_to_pytorch.py
<https://github.com/huggingface/transformers/blob/master/src/transformers/convert_bert_original_tf_checkpoint_to_pytorch.py>`_
script.
:prefix_link:`convert_bert_original_tf_checkpoint_to_pytorch.py
<src/transformers/convert_bert_original_tf_checkpoint_to_pytorch.py>` script.

This CLI takes as input a TensorFlow checkpoint (three files starting with ``bert_model.ckpt``\ ) and the associated
configuration file (\ ``bert_config.json``\ ), and creates a PyTorch model for this configuration, loads the weights
@@ -66,9 +65,8 @@ ALBERT
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Convert TensorFlow model checkpoints of ALBERT to PyTorch using the
`convert_albert_original_tf_checkpoint_to_pytorch.py
<https://github.com/huggingface/transformers/blob/master/src/transformers/convert_bert_original_tf_checkpoint_to_pytorch.py>`_
script.
:prefix_link:`convert_albert_original_tf_checkpoint_to_pytorch.py
<src/transformers/convert_albert_original_tf_checkpoint_to_pytorch.py>` script.

The CLI takes as input a TensorFlow checkpoint (three files starting with ``model.ckpt-best``\ ) and the accompanying
configuration file (\ ``albert_config.json``\ ), then creates and saves a PyTorch model. To run this conversion you
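
As a concrete illustration of the conversion CLIs described in this file, here is a hedged sketch of driving the BERT script from Python; the paths are placeholders and the flag names should be verified against the script's argparse setup in your checkout.

.. code-block:: python

    # Placeholder paths; verify the flag names against your checkout.
    import subprocess

    subprocess.run(
        [
            "python",
            "src/transformers/convert_bert_original_tf_checkpoint_to_pytorch.py",
            "--tf_checkpoint_path", "/path/to/bert_model.ckpt",
            "--bert_config_file", "/path/to/bert_config.json",
            "--pytorch_dump_path", "/path/to/pytorch_model.bin",
        ],
        check=True,
    )
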
4 changes: 2 additions & 2 deletions docs/source/main_classes/processors.rst
@@ -168,5 +168,5 @@ Using `tensorflow_datasets` is as easy as using a data file:
)


Another example using these processors is given in the `run_squad.py
<https://github.com/huggingface/transformers/blob/master/examples/question-answering/run_squad.py>`__ script.
Another example using these processors is given in the :prefix_link:`run_squad.py
<examples/question-answering/run_squad.py>` script.
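
For orientation, a short hedged sketch of the SQuAD processors that ``run_squad.py`` builds on; the data directory is a placeholder and the hyperparameters are common defaults rather than prescriptions.

.. code-block:: python

    # Sketch: turn raw SQuAD v1.1 data into model features.
    from transformers import BertTokenizer
    from transformers.data.processors.squad import (
        SquadV1Processor,
        squad_convert_examples_to_features,
    )

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    processor = SquadV1Processor()
    examples = processor.get_train_examples("/path/to/squad")  # expects train-v1.1.json

    features = squad_convert_examples_to_features(
        examples=examples,
        tokenizer=tokenizer,
        max_seq_length=384,
        doc_stride=128,
        max_query_length=64,
        is_training=True,
    )
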
2 changes: 1 addition & 1 deletion docs/source/model_doc/bart.rst
@@ -42,7 +42,7 @@ Examples
_______________________________________________________________________________________________________________________

- Examples and scripts for fine-tuning BART and other models for sequence to sequence tasks can be found in
`examples/seq2seq/ <https://github.com/huggingface/transformers/blob/master/examples/seq2seq/README.md>`__.
:prefix_link:`examples/seq2seq/ <examples/seq2seq/README.md>`.
- An example of how to train :class:`~transformers.BartForConditionalGeneration` with a Hugging Face :obj:`datasets`
object can be found in this `forum discussion
<https://discuss.huggingface.co/t/train-bart-for-conditional-generation-e-g-summarization/1904>`__.
4 changes: 2 additions & 2 deletions docs/source/model_doc/barthez.rst
@@ -41,8 +41,8 @@ The Authors' code can be found `here <https://github.com/moussaKam/BARThez>`__.
Examples
_______________________________________________________________________________________________________________________

- BARThez can be fine-tuned on sequence-to-sequence tasks in a similar way as BART, check: `examples/seq2seq/
<https://github.com/huggingface/transformers/blob/master/examples/seq2seq/README.md>`__.
- BARThez can be fine-tuned on sequence-to-sequence tasks in a similar way as BART, check:
:prefix_link:`examples/seq2seq/ <examples/seq2seq/README.md>`.


BarthezTokenizer
9 changes: 3 additions & 6 deletions docs/source/model_doc/marian.rst
@@ -56,12 +56,9 @@ Examples

- Since Marian models are smaller than many other translation models available in the library, they can be useful for
fine-tuning experiments and integration tests.
- `Fine-tune on TPU
<https://github.com/huggingface/transformers/blob/master/examples/seq2seq/builtin_trainer/train_distil_marian_enro_tpu.sh>`__
- `Fine-tune on GPU
<https://github.com/huggingface/transformers/blob/master/examples/seq2seq/builtin_trainer/train_distil_marian_enro.sh>`__
- `Fine-tune on GPU with pytorch-lightning
<https://github.com/huggingface/transformers/blob/master/examples/seq2seq/distil_marian_no_teacher.sh>`__
- :prefix_link:`Fine-tune on TPU <examples/seq2seq/builtin_trainer/train_distil_marian_enro_tpu.sh>`
- :prefix_link:`Fine-tune on GPU <examples/seq2seq/builtin_trainer/train_distil_marian_enro.sh>`
- :prefix_link:`Fine-tune on GPU with pytorch-lightning <examples/seq2seq/distil_marian_no_teacher.sh>`

Multilingual Models
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2 changes: 1 addition & 1 deletion docs/source/model_doc/mbart.rst
@@ -35,7 +35,7 @@ Examples
_______________________________________________________________________________________________________________________

- Examples and scripts for fine-tuning mBART and other models for sequence to sequence tasks can be found in
`examples/seq2seq/ <https://github.com/huggingface/transformers/blob/master/examples/seq2seq/README.md>`__.
:prefix_link:`examples/seq2seq/ <examples/seq2seq/README.md>`.
- Given the large embeddings table, mBART consumes a large amount of GPU RAM, especially for fine-tuning.
:class:`MarianMTModel` is usually a better choice for bilingual machine translation.

5 changes: 2 additions & 3 deletions docs/source/model_doc/pegasus.rst
@@ -51,9 +51,8 @@ All the `checkpoints <https://huggingface.co/models?search=pegasus>`__ are fine-
Examples
_______________________________________________________________________________________________________________________

- `Script <https://github.com/huggingface/transformers/blob/master/examples/seq2seq/finetune_pegasus_xsum.sh>`__ to
fine-tune pegasus on the XSUM dataset. Data download instructions at `examples/seq2seq/
<https://github.com/huggingface/transformers/blob/master/examples/seq2seq/README.md>`__.
- :prefix_link:`Script <examples/seq2seq/finetune_pegasus_xsum.sh>` to fine-tune pegasus on the XSUM dataset. Data
download instructions at :prefix_link:`examples/seq2seq/ <examples/seq2seq/README.md>`.
- FP16 is not supported (help/ideas on this appreciated!).
- The adafactor optimizer is recommended for pegasus fine-tuning; see the hedged sketch below.
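
A minimal sketch of one commonly suggested Adafactor setup for pegasus fine-tuning; the checkpoint and every hyperparameter below are assumptions to validate on your task.

.. code-block:: python

    # Hedged sketch: fixed-LR Adafactor, as often suggested for pegasus.
    from transformers import PegasusForConditionalGeneration
    from transformers.optimization import Adafactor

    model = PegasusForConditionalGeneration.from_pretrained("google/pegasus-xsum")
    optimizer = Adafactor(
        model.parameters(),
        lr=1e-3,                 # fixed learning rate (assumption)
        scale_parameter=False,
        relative_step=False,
        warmup_init=False,
    )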

5 changes: 2 additions & 3 deletions docs/source/multilingual.rst
@@ -90,9 +90,8 @@ You can then feed it all as input to your model:
>>> outputs = model(input_ids, langs=langs)


The example `run_generation.py
<https://github.com/huggingface/transformers/blob/master/examples/text-generation/run_generation.py>`__ can generate
text using the CLM checkpoints from XLM, using the language embeddings.
The example :prefix_link:`run_generation.py <examples/text-generation/run_generation.py>` can generate text using the
CLM checkpoints from XLM, using the language embeddings.
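
To make the language-embedding mechanics concrete, here is a short sketch assuming the ``xlm-clm-enfr-1024`` checkpoint; ``config.lang2id`` maps a language code to the ID that is fed through ``langs``.

.. code-block:: python

    # Sketch: one language ID per input token, taken from config.lang2id.
    import torch
    from transformers import XLMTokenizer, XLMWithLMHeadModel

    tokenizer = XLMTokenizer.from_pretrained("xlm-clm-enfr-1024")
    model = XLMWithLMHeadModel.from_pretrained("xlm-clm-enfr-1024")

    input_ids = torch.tensor([tokenizer.encode("Wikipedia was used to")])
    langs = torch.full_like(input_ids, model.config.lang2id["en"])
    outputs = model(input_ids, langs=langs)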

XLM without Language Embeddings
-----------------------------------------------------------------------------------------------------------------------
6 changes: 2 additions & 4 deletions docs/source/task_summary.rst
@@ -750,8 +750,7 @@ Summarization is the task of summarizing a document or an article into a shorter

An example of a summarization dataset is the CNN / Daily Mail dataset, which consists of long news articles and was
created for the task of summarization. If you would like to fine-tune a model on a summarization task, various
approaches are described in this `document
<https://github.com/huggingface/transformers/blob/master/examples/seq2seq/README.md>`__.
approaches are described in this :prefix_link:`document <examples/seq2seq/README.md>`.

Here is an example of using the pipelines to do summarization. It leverages a Bart model that was fine-tuned on the CNN
/ Daily Mail data set.
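
For reference, a minimal sketch of such a pipeline call; the default summarization model is resolved by the library and may change between versions.

.. code-block:: python

    # Sketch: default summarization pipeline (a Bart model under the hood).
    from transformers import pipeline

    summarizer = pipeline("summarization")
    article = "..."  # a long news article goes here
    print(summarizer(article, max_length=130, min_length=30, do_sample=False))
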
@@ -829,8 +828,7 @@ Translation is the task of translating a text from one language to another.

An example of a translation dataset is the WMT English to German dataset, which has sentences in English as the input
data and the corresponding sentences in German as the target data. If you would like to fine-tune a model on a
translation task, various approaches are described in this `document
<https://github.com/huggingface/transformers/blob/master/examples/seq2seq/README.md>`__.
translation task, various approaches are described in this :prefix_link:`document <examples/seq2seq/README.md>`.

Here is an example of using the pipelines to do translation. It leverages a T5 model that was only pre-trained on a
multi-task mixture dataset (including WMT), yet yields impressive translation results.
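
Again for reference, a minimal sketch of the translation pipeline; the default English-to-German model is resolved by the library.

.. code-block:: python

    # Sketch: T5-based English-to-German translation pipeline.
    from transformers import pipeline

    translator = pipeline("translation_en_to_de")
    print(translator("Hugging Face is a technology company based in New York and Paris.",
                     max_length=40))
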
35 changes: 14 additions & 21 deletions docs/source/testing.rst
@@ -25,25 +25,22 @@ How transformers are tested
-----------------------------------------------------------------------------------------------------------------------

1. Once a PR is submitted it gets tested with 9 CircleCI jobs. Every new commit to that PR gets retested. These jobs
are defined in this `config file <https://github.com/huggingface/transformers/blob/master/.circleci/config.yml>`__,
so that if needed you can reproduce the same environment on your machine.
are defined in this :prefix_link:`config file <.circleci/config.yml>`, so that if needed you can reproduce the same
environment on your machine.

These CI jobs don't run ``@slow`` tests.

2. There are 3 jobs run by `github actions <https://github.com/huggingface/transformers/actions>`__:

* `torch hub integration
<https://github.com/huggingface/transformers/blob/master/.github/workflows/github-torch-hub.yml>`__: checks
whether torch hub integration works.
* :prefix_link:`torch hub integration <.github/workflows/github-torch-hub.yml>`: checks whether torch hub
integration works.

* `self-hosted (push) <https://github.com/huggingface/transformers/blob/master/.github/workflows/self-push.yml>`__:
runs fast tests on GPU only on commits on ``master``. It only runs if a commit on ``master`` has updated the code
in one of the following folders: ``src``, ``tests``, ``.github`` (to prevent running on added model cards,
notebooks, etc.)
* :prefix_link:`self-hosted (push) <.github/workflows/self-push.yml>`: runs fast tests on GPU only on commits on
``master``. It only runs if a commit on ``master`` has updated the code in one of the following folders: ``src``,
``tests``, ``.github`` (to prevent running on added model cards, notebooks, etc.)

* `self-hosted runner
<https://github.com/huggingface/transformers/blob/master/.github/workflows/self-scheduled.yml>`__: runs normal and
slow tests on GPU in ``tests`` and ``examples``:
* :prefix_link:`self-hosted runner <.github/workflows/self-scheduled.yml>`: runs normal and slow tests on GPU in
``tests`` and ``examples``:

.. code-block:: bash

@@ -492,12 +489,9 @@ spawns a normal process that then spawns off multiple workers and manages the IO

This is still under development but you can study 2 different tests that perform this successfully:

* `test_seq2seq_examples_multi_gpu.py
<https://github.com/huggingface/transformers/blob/master/examples/seq2seq/test_seq2seq_examples_multi_gpu.py>`__ - a
* :prefix_link:`test_seq2seq_examples_multi_gpu.py <examples/seq2seq/test_seq2seq_examples_multi_gpu.py>` - a
``pytorch-lightning``-running test (had to use PL's ``ddp`` spawning method which is the default)
* `test_finetune_trainer.py
<https://github.com/huggingface/transformers/blob/master/examples/seq2seq/test_finetune_trainer.py>`__ - a normal
(non-PL) test
* :prefix_link:`test_finetune_trainer.py <examples/seq2seq/test_finetune_trainer.py>` - a normal (non-PL) test

To jump right into the execution point, search for the ``execute_subprocess_async`` function in those tests.
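
For orientation only, here is a minimal sketch of the underlying pattern: a plain test that spawns a multi-worker run in a child process and asserts on its IO. The command line is illustrative; ``execute_subprocess_async`` in the repo's test utilities handles the same job more robustly (async IO capture, timeouts, environment plumbing).

.. code-block:: python

    # Illustrative stand-in for execute_subprocess_async: launch a 2-GPU
    # distributed run in a child process and fail the test if it errors.
    import subprocess
    import sys

    def test_distributed_script_runs():
        cmd = [
            sys.executable, "-m", "torch.distributed.launch",
            "--nproc_per_node=2",
            "examples/seq2seq/finetune_trainer.py",  # placeholder target script
            "--help",                                 # placeholder args
        ]
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=600)
        assert result.returncode == 0, result.stderr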

@@ -940,10 +934,9 @@ slow models to do qualitative testing. To see the use of these simply look for *

grep tiny tests examples

Here is a an example of a `script
<https://github.com/huggingface/transformers/blob/master/scripts/fsmt/fsmt-make-tiny-model.py>`__ that created the tiny
model `stas/tiny-wmt19-en-de <https://huggingface.co/stas/tiny-wmt19-en-de>`__. You can easily adjust it to your
specific model's architecture.
Here is an example of a :prefix_link:`script <scripts/fsmt/fsmt-make-tiny-model.py>` that created the tiny model
`stas/tiny-wmt19-en-de <https://huggingface.co/stas/tiny-wmt19-en-de>`__. You can easily adjust it to your specific
model's architecture.
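
To build such a tiny model for another architecture, the recipe boils down to a tiny random config plus ``save_pretrained``. Below is a hedged sketch with illustrative sizes; it is not the linked fsmt script.

.. code-block:: python

    # Tiny random BERT for fast tests; every size below is illustrative.
    from transformers import BertConfig, BertModel

    config = BertConfig(
        vocab_size=1024,
        hidden_size=32,
        num_hidden_layers=2,
        num_attention_heads=2,
        intermediate_size=64,
    )
    model = BertModel(config)
    model.save_pretrained("tiny-bert-for-tests")  # a few MB instead of GBs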

It's easy to measure the run-time incorrectly if, for example, there is the overhead of downloading a huge model, but if
you test it locally the downloaded files would be cached and thus the download time not measured. Hence check the