Fine-tuning t5-base model raises an error #1661

Closed
krzysztoffiok opened this issue Jun 3, 2020 · 10 comments
Labels: bug, wontfix

krzysztoffiok commented Jun 3, 2020

Hi,

I tried to fine-tune the t5-base model on Google Colab and got this error:

ValueError("You have to specify either decoder_input_ids or decoder_inputs_embeds")

More specifically, the error occurs at the very moment training should start:

2020-06-03 12:07:25,877 ----------------------------------------------------------------------------------------------------
2020-06-03 12:07:25,877 Corpus: "Corpus: 4800 train + 1200 dev + 20630 test sentences"
2020-06-03 12:07:25,878 ----------------------------------------------------------------------------------------------------
2020-06-03 12:07:25,878 Parameters:
2020-06-03 12:07:25,878 - learning_rate: "3e-06"
2020-06-03 12:07:25,879 - mini_batch_size: "8"
2020-06-03 12:07:25,879 - patience: "3"
2020-06-03 12:07:25,879 - anneal_factor: "0.5"
2020-06-03 12:07:25,880 - max_epochs: "4"
2020-06-03 12:07:25,880 - shuffle: "True"
2020-06-03 12:07:25,880 - train_with_dev: "False"
2020-06-03 12:07:25,880 - batch_growth_annealing: "False"
2020-06-03 12:07:25,880 ----------------------------------------------------------------------------------------------------
2020-06-03 12:07:25,880 Model training base path: "semeval_data/model_sentiment_0"
2020-06-03 12:07:25,880 ----------------------------------------------------------------------------------------------------
2020-06-03 12:07:25,880 Device: cuda:0
2020-06-03 12:07:25,881 ----------------------------------------------------------------------------------------------------
2020-06-03 12:07:25,881 Embeddings storage mode: cpu
2020-06-03 12:07:25,883 ----------------------------------------------------------------------------------------------------
Traceback (most recent call last):
  File "./model_train.py", line 138, in <module>
    shuffle=True,
  File "/usr/local/lib/python3.6/dist-packages/flair/trainers/trainer.py", line 349, in train
    loss = self.model.forward_loss(batch_step)
  File "/usr/local/lib/python3.6/dist-packages/flair/models/text_classification_model.py", line 142, in forward_loss
    scores = self.forward(data_points)
  File "/usr/local/lib/python3.6/dist-packages/flair/models/text_classification_model.py", line 98, in forward
    self.document_embeddings.embed(sentences)
  File "/usr/local/lib/python3.6/dist-packages/flair/embeddings/base.py", line 59, in embed
    self._add_embeddings_internal(sentences)
  File "/usr/local/lib/python3.6/dist-packages/flair/embeddings/document.py", line 91, in _add_embeddings_internal
    self._add_embeddings_to_sentences(batch)
  File "/usr/local/lib/python3.6/dist-packages/flair/embeddings/document.py", line 136, in _add_embeddings_to_sentences
    else self.model(input_ids)[-1]
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/transformers/modeling_t5.py", line 955, in forward
    use_cache=use_cache,
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/transformers/modeling_t5.py", line 674, in forward
    raise ValueError("You have to specify either decoder_input_ids or decoder_inputs_embeds")
ValueError: You have to specify either decoder_input_ids or decoder_inputs_embeds
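
The traceback shows flair's document embedding calling `self.model(input_ids)` on a full T5 model. T5 is an encoder-decoder, so its forward pass also expects decoder inputs. One way around this, sketched below, is to load only the encoder stack. This is a minimal sketch and not necessarily how flair handles it: it assumes a recent `transformers` release (4.x, newer than the 2.11.0 listed in this report) that provides `T5EncoderModel`, and the helper name `encode_with_t5_encoder` and the mean-pooling step are illustrative choices, not part of flair.

```python
from typing import List


def encode_with_t5_encoder(texts: List[str], model_name: str = "t5-base"):
    """Return one mean-pooled encoder vector per input text (illustrative helper)."""
    # Imported lazily so this module loads even without torch/transformers.
    import torch
    from transformers import AutoTokenizer, T5EncoderModel

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    # T5EncoderModel loads only the encoder stack, so the forward pass
    # does not need decoder_input_ids.
    model = T5EncoderModel.from_pretrained(model_name)
    model.eval()

    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(
            input_ids=batch["input_ids"],
            attention_mask=batch["attention_mask"],
        ).last_hidden_state  # shape: (batch, seq_len, d_model)

    # Mean-pool over real tokens only (padding positions have mask == 0).
    mask = batch["attention_mask"].unsqueeze(-1).to(hidden.dtype)
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1)
```

Calling `encode_with_t5_encoder(["some tweet"])` downloads the checkpoint on first use and should return a tensor with one row per input text.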

To Reproduce
Go to Google Colab, create a new notebook with a GPU runtime, and run the following:
!git clone https://github.com/krzysztoffiok/twitter_sentiment
!pip3 install flair
!pip3 install datatable

cd twitter_sentiment

!python3 ./semeval_data_splitter.py
!python3 ./model_train.py --dataset=semeval --k_folds=5 --test_run=t5-base --fine_tune

Expected behavior
The script should start training (fine-tuning) a list of models, the first of which is t5-base.

Environment (please complete the following information):
Google Colab GPU runtime

!nvidia-smi
Wed Jun 3 12:11:08 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.82       Driver Version: 418.67       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 00000000:00:04.0 Off |                    0 |
| N/A   36C    P8    26W / 149W |      0MiB / 11441MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

!pip3 freeze returns:

absl-py==0.9.0
alabaster==0.7.12
albumentations==0.1.12
altair==4.1.0
asgiref==3.2.7
astor==0.8.1
astropy==4.0.1.post1
astunparse==1.6.3
atari-py==0.2.6
atomicwrites==1.4.0
attrs==19.3.0
audioread==2.1.8
autograd==1.3
Babel==2.8.0
backcall==0.1.0
beautifulsoup4==4.6.3
bleach==3.1.5
blessed==1.17.6
blis==0.4.1
bokeh==1.4.0
boto==2.49.0
boto3==1.13.19
botocore==1.16.19
Bottleneck==1.3.2
bpemb==0.3.0
branca==0.4.1
bs4==0.0.1
CacheControl==0.12.6
cachetools==3.1.1
catalogue==1.0.0
certifi==2020.4.5.1
cffi==1.14.0
chainer==6.5.0
chardet==3.0.4
click==7.1.2
cloudpickle==1.3.0
cmake==3.12.0
cmdstanpy==0.4.0
colorama==0.4.3
colorlover==0.3.0
community==1.0.0b1
contextlib2==0.5.5
convertdate==2.2.1
coverage==3.7.1
coveralls==0.5
crcmod==1.7
cufflinks==0.17.3
cupy-cuda101==6.5.0
cvxopt==1.2.5
cvxpy==1.0.31
cycler==0.10.0
cymem==2.0.3
Cython==0.29.19
daft==0.0.4
dask==2.12.0
dataclasses==0.7
datascience==0.10.6
datatable==0.10.1
decorator==4.4.2
defusedxml==0.6.0
Deprecated==1.2.10
descartes==1.1.0
dill==0.3.1.1
distributed==1.25.3
Django==3.0.6
dlib==19.18.0
docopt==0.6.2
docutils==0.15.2
dopamine-rl==1.0.5
earthengine-api==0.1.223
easydict==1.9
ecos==2.0.7.post1
editdistance==0.5.3
en-core-web-sm==2.2.5
entrypoints==0.3
ephem==3.7.7.1
et-xmlfile==1.0.1
fa2==0.3.5
fancyimpute==0.4.3
fastai==1.0.61
fastdtw==0.3.4
fastprogress==0.2.3
fastrlock==0.4
fbprophet==0.6
feather-format==0.4.1
featuretools==0.4.1
filelock==3.0.12
firebase-admin==4.1.0
fix-yahoo-finance==0.0.22
flair==0.5
Flask==1.1.2
folium==0.8.3
fsspec==0.7.4
future==0.16.0
gast==0.3.3
GDAL==2.2.2
gdown==3.6.4
gensim==3.6.0
geographiclib==1.50
geopy==1.17.0
gin-config==0.3.0
glob2==0.7
google==2.0.3
google-api-core==1.16.0
google-api-python-client==1.7.12
google-auth==1.7.2
google-auth-httplib2==0.0.3
google-auth-oauthlib==0.4.1
google-cloud-bigquery==1.21.0
google-cloud-core==1.0.3
google-cloud-datastore==1.8.0
google-cloud-firestore==1.7.0
google-cloud-language==1.2.0
google-cloud-storage==1.18.1
google-cloud-translate==1.5.0
google-colab==1.0.0
google-pasta==0.2.0
google-resumable-media==0.4.1
googleapis-common-protos==1.51.0
googledrivedownloader==0.4
graphviz==0.10.1
grpcio==1.29.0
gspread==3.0.1
gspread-dataframe==3.0.7
gym==0.17.2
h5py==2.10.0
HeapDict==1.0.1
holidays==0.9.12
html5lib==1.0.1
httpimport==0.5.18
httplib2==0.17.4
httplib2shim==0.0.3
humanize==0.5.1
hyperopt==0.1.2
ideep4py==2.0.0.post3
idna==2.9
image==1.5.32
imageio==2.4.1
imagesize==1.2.0
imbalanced-learn==0.4.3
imblearn==0.0
imgaug==0.2.9
importlib-metadata==1.6.0
imutils==0.5.3
inflect==2.1.0
intel-openmp==2020.0.133
intervaltree==2.1.0
ipykernel==4.10.1
ipython==5.5.0
ipython-genutils==0.2.0
ipython-sql==0.3.9
ipywidgets==7.5.1
itsdangerous==1.1.0
jax==0.1.68
jaxlib==0.1.47
jdcal==1.4.1
jedi==0.17.0
jieba==0.42.1
Jinja2==2.11.2
jmespath==0.10.0
joblib==0.15.1
jpeg4py==0.1.4
jsonschema==2.6.0
jupyter==1.0.0
jupyter-client==5.3.4
jupyter-console==5.2.0
jupyter-core==4.6.3
kaggle==1.5.6
kapre==0.1.3.1
Keras==2.3.1
Keras-Applications==1.0.8
Keras-Preprocessing==1.1.2
keras-vis==0.4.1
kiwisolver==1.2.0
knnimpute==0.1.0
langdetect==1.0.8
librosa==0.6.3
lightgbm==2.2.3
llvmlite==0.31.0
lmdb==0.98
lucid==0.3.8
LunarCalendar==0.0.9
lxml==4.2.6
Markdown==3.2.2
MarkupSafe==1.1.1
matplotlib==3.2.1
matplotlib-venn==0.11.5
missingno==0.4.2
mistune==0.8.4
mizani==0.6.0
mkl==2019.0
mlxtend==0.14.0
more-itertools==8.3.0
moviepy==0.2.3.5
mpld3==0.3
mpmath==1.1.0
msgpack==1.0.0
multiprocess==0.70.9
multitasking==0.0.9
murmurhash==1.0.2
music21==5.5.0
natsort==5.5.0
nbconvert==5.6.1
nbformat==5.0.6
networkx==2.4
nibabel==3.0.2
nltk==3.2.5
notebook==5.2.2
np-utils==0.5.12.1
numba==0.48.0
numexpr==2.7.1
numpy==1.18.4
nvidia-ml-py3==7.352.0
oauth2client==4.1.3
oauthlib==3.1.0
okgrade==0.4.3
opencv-contrib-python==4.1.2.30
opencv-python==4.1.2.30
openpyxl==2.5.9
opt-einsum==3.2.1
osqp==0.6.1
packaging==20.4
palettable==3.3.0
pandas==1.0.4
pandas-datareader==0.8.1
pandas-gbq==0.11.0
pandas-profiling==1.4.1
pandocfilters==1.4.2
parso==0.7.0
pathlib==1.0.1
patsy==0.5.1
pexpect==4.8.0
pickleshare==0.7.5
Pillow==7.0.0
pip-tools==4.5.1
plac==1.1.3
plotly==4.4.1
plotnine==0.6.0
pluggy==0.13.1
portpicker==1.3.1
prefetch-generator==1.0.1
preshed==3.0.2
prettytable==0.7.2
progressbar2==3.38.0
prometheus-client==0.8.0
promise==2.3
prompt-toolkit==1.0.18
protobuf==3.10.0
psutil==5.4.8
psycopg2==2.7.6.1
ptvsd==5.0.0a12
ptyprocess==0.6.0
py==1.8.1
pyarrow==0.14.1
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycocotools==2.0.0
pycparser==2.20
pydata-google-auth==1.1.0
pydot==1.3.0
pydot-ng==2.0.0
pydotplus==2.0.2
PyDrive==1.3.1
pyemd==0.5.1
pyglet==1.5.0
Pygments==2.1.3
pygobject==3.26.1
pymc3==3.7
PyMeeus==0.3.7
pymongo==3.10.1
pymystem3==0.2.0
PyOpenGL==3.1.5
pyparsing==2.4.7
pyrsistent==0.16.0
pysndfile==1.3.8
PySocks==1.7.1
pystan==2.19.1.1
pytest==5.4.3
python-apt==1.6.5+ubuntu0.2
python-chess==0.23.11
python-dateutil==2.8.1
python-louvain==0.14
python-slugify==4.0.0
python-utils==2.4.0
pytz==2018.9
PyWavelets==1.1.1
PyYAML==3.13
pyzmq==19.0.1
qtconsole==4.7.4
QtPy==1.9.0
regex==2019.12.20
requests==2.23.0
requests-oauthlib==1.3.0
resampy==0.2.2
retrying==1.3.3
rpy2==3.2.7
rsa==4.0
s3fs==0.4.2
s3transfer==0.3.3
sacremoses==0.0.43
scikit-image==0.16.2
scikit-learn==0.22.2.post1
scipy==1.4.1
screen-resolution-extra==0.0.0
scs==2.1.2
seaborn==0.10.1
segtok==1.5.10
Send2Trash==1.5.0
sentencepiece==0.1.91
setuptools-git==1.2
Shapely==1.7.0
simplegeneric==0.8.1
six==1.12.0
sklearn==0.0
sklearn-pandas==1.8.0
smart-open==2.0.0
snowballstemmer==2.0.0
sortedcontainers==2.1.0
spacy==2.2.4
Sphinx==1.8.5
sphinxcontrib-websupport==1.2.2
SQLAlchemy==1.3.17
sqlitedict==1.6.0
sqlparse==0.3.1
srsly==1.0.2
statsmodels==0.10.2
sympy==1.1.1
tables==3.4.4
tabulate==0.8.7
tbb==2020.0.133
tblib==1.6.0
tensorboard==2.2.2
tensorboard-plugin-wit==1.6.0.post3
tensorboardcolab==0.0.22
tensorflow==2.2.0
tensorflow-addons==0.8.3
tensorflow-datasets==2.1.0
tensorflow-estimator==2.2.0
tensorflow-gcs-config==2.1.8
tensorflow-hub==0.8.0
tensorflow-metadata==0.22.1
tensorflow-privacy==0.2.2
tensorflow-probability==0.10.0
termcolor==1.1.0
terminado==0.8.3
testpath==0.4.4
text-unidecode==1.3
textblob==0.15.3
textgenrnn==1.4.1
Theano==1.0.4
thinc==7.4.0
tifffile==2020.5.30
tokenizers==0.7.0
toolz==0.10.0
torch==1.5.0+cu101
torchsummary==1.5.1
torchtext==0.3.1
torchvision==0.6.0+cu101
tornado==4.5.3
tqdm==4.41.1
traitlets==4.3.3
transformers==2.11.0
tweepy==3.6.0
typeguard==2.7.1
typesentry==0.2.7
typing==3.6.6
typing-extensions==3.6.6
tzlocal==1.5.1
umap-learn==0.4.3
uritemplate==3.0.1
urllib3==1.24.3
vega-datasets==0.8.0
wasabi==0.6.0
wcwidth==0.1.9
webencodings==0.5.1
Werkzeug==1.0.1
widgetsnbextension==3.5.1
wordcloud==1.5.0
wrapt==1.12.1
xarray==0.15.1
xgboost==0.90
xkit==0.0.0
xlrd==1.1.0
xlwt==1.3.0
yellowbrick==0.9.1
zict==2.0.0
zipp==3.1.0
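
Most of the list above is Colab's preinstalled stack. For this error, the pins that matter are probably flair, transformers, and torch. A small sketch for pulling just those out of `pip3 freeze` output (the helper name `pinned_versions` is made up for illustration):

```python
from typing import Dict, Iterable


def pinned_versions(freeze_output: str, wanted: Iterable[str]) -> Dict[str, str]:
    """Pick the `name==version` pins for the wanted packages out of pip freeze text."""
    wanted_lower = {name.lower() for name in wanted}
    found = {}
    for line in freeze_output.splitlines():
        # Lines without "==" (e.g. editable installs) are skipped.
        name, sep, version = line.strip().partition("==")
        if sep and name.lower() in wanted_lower:
            found[name] = version
    return found


# A few lines copied from the freeze output above.
sample = "flair==0.5\nnumpy==1.18.4\ntorch==1.5.0+cu101\ntransformers==2.11.0\n"
print(pinned_versions(sample, ["flair", "torch", "transformers"]))
# {'flair': '0.5', 'torch': '1.5.0+cu101', 'transformers': '2.11.0'}
```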

krzysztoffiok added the bug label Jun 3, 2020
@nightlessbaron

Did you solve the error? I am also facing the same bug

stale bot commented Nov 29, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the wontfix label Nov 29, 2020
stale bot closed this as completed Dec 6, 2020
@stefan-it (Member)

I have a working solution for it, will prepare a PR for that soon, so re-opening it!

stefan-it reopened this Apr 6, 2022
stale bot removed the wontfix label Apr 6, 2022
ataniz commented Jul 18, 2022

> I have a working solution for it, will prepare a PR for that soon, so re-opening it!

Hi @stefan-it, are there any updates on the PR? The error still persists.

Madhu000 commented Aug 7, 2022

I am facing the same issue with mt5-small. Can anyone fix this? Any guidance is welcome. Thanks in advance.

@stefan-it (Member)

Hi @ataniz and @Madhu000,

Sorry for the late reply! I pushed a working version of encoder-only fine-tuning for T5 models:

#2896

Feel free to test it 🤗

stefan-it self-assigned this Aug 8, 2022
Madhu000 commented Aug 8, 2022 via email

@stefan-it (Member)

Hi @Madhu000,

It seems that Flair in your virtual environment is still the installed 0.11 version. This can be seen in the logs: they reference line 692 of flair/embeddings/base.py, which does not exist in the latest master because of a recent refactoring. Here's a short snippet showing how to use the T5 encoder fix branch:

pip3 uninstall flair

git clone https://github.com/flairNLP/flair.git
cd flair
git checkout add-t5-encoder-support
pip3 install -e .

Then you can try using it again :)
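
To confirm which Flair the interpreter actually picks up after the steps above, a quick check along these lines may help. This is a sketch using only the standard library (Python 3.8+); the helper names `installed_version` and `module_location` are made up for illustration.

```python
import importlib.util
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed distribution version, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def module_location(module: str) -> Optional[str]:
    """Return the file a top-level module would be imported from, or None if not found."""
    spec = importlib.util.find_spec(module)
    return spec.origin if spec else None


print("flair version:", installed_version("flair"))
print("flair imported from:", module_location("flair"))
```

With an editable install (`pip3 install -e .`), the reported location should point into the cloned `flair` directory rather than `site-packages`.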

Madhu000 commented Aug 8, 2022 via email

stale bot commented Dec 24, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the wontfix label Dec 24, 2022
stale bot closed this as completed Jan 7, 2023