
Version 3.5 broke the multi context/questions feature for the QuestionAnsweringPipeline #8759

Closed
Mathieu4141 opened this issue Nov 24, 2020 · 1 comment · Fixed by #8765

@Mathieu4141

Environment info

  • transformers version: 3.5.1 (also in 3.5.0)
  • Platform: Darwin-20.1.0-x86_64-i386-64bit
  • Python version: 3.7.5
  • PyTorch version (GPU?): 1.7.0 (False)
  • Tensorflow version (GPU?): 2.3.1 (False)
  • Using GPU in script?: no
  • Using distributed or parallel set-up in script?: no

Who can help

tokenizers: @mfuntowicz

Information

Model I am using (Bert, XLNet ...): Default QuestionAnsweringPipeline

The task I am working on is:

  • an official GLUE/SQUaD task: Extractive Question Answering

To reproduce

Steps to reproduce the behavior:

  1. Install transformers 3.5.1 (also in 3.5.0)
  2. Run the following:
from transformers import pipeline

nlp = pipeline("question-answering")

context = r"""
Extractive Question Answering is the task of extracting an answer from a text given a question. An example of a
question answering dataset is the SQuAD dataset, which is entirely based on that task. If you would like to fine-tune
a model on a SQuAD task, you may leverage the `run_squad.py`.
"""

print(
    nlp(
        question=["What is extractive question answering?", "What is a good example of a question answering dataset?"],
        context=[context, context],
    )
)

In versions 3.5.0 and 3.5.1, I get this error:

multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/Users/cytadel/.pyenv/versions/3.7.5/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "/Users/cytadel/.pyenv/versions/3.7.5/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "/Users/cytadel/Library/Caches/pypoetry/virtualenvs/feedly.ml-cyber-attacks-4LjjtgqO-py3.7/lib/python3.7/site-packages/transformers/data/processors/squad.py", line 110, in squad_convert_example_to_features
    for (i, token) in enumerate(example.doc_tokens):
AttributeError: 'list' object has no attribute 'doc_tokens'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/cytadel/feedly/ml/do_not_commit.py", line 14, in <module>
    context=[context, context],
  File "/Users/cytadel/Library/Caches/pypoetry/virtualenvs/feedly.ml-cyber-attacks-4LjjtgqO-py3.7/lib/python3.7/site-packages/transformers/pipelines.py", line 1787, in __call__
    for example in examples
  File "/Users/cytadel/Library/Caches/pypoetry/virtualenvs/feedly.ml-cyber-attacks-4LjjtgqO-py3.7/lib/python3.7/site-packages/transformers/pipelines.py", line 1787, in <listcomp>
    for example in examples
  File "/Users/cytadel/Library/Caches/pypoetry/virtualenvs/feedly.ml-cyber-attacks-4LjjtgqO-py3.7/lib/python3.7/site-packages/transformers/data/processors/squad.py", line 368, in squad_convert_examples_to_features
    disable=not tqdm_enabled,
  File "/Users/cytadel/Library/Caches/pypoetry/virtualenvs/feedly.ml-cyber-attacks-4LjjtgqO-py3.7/lib/python3.7/site-packages/tqdm/std.py", line 1171, in __iter__
    for obj in iterable:
  File "/Users/cytadel/.pyenv/versions/3.7.5/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/pool.py", line 325, in <genexpr>
    return (item for chunk in result for item in chunk)
  File "/Users/cytadel/.pyenv/versions/3.7.5/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/pool.py", line 748, in next
    raise value
AttributeError: 'list' object has no attribute 'doc_tokens'
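The traceback suggests that the worker function receives the whole list of examples instead of a single example. Here is a minimal pure-Python sketch of that failure mode; `SquadExample` and `convert` are simplified stand-ins (not the actual `transformers` classes) used only to illustrate the mechanism:

```python
# Stand-in for transformers' SquadExample: an object carrying doc_tokens.
class SquadExample:
    def __init__(self, doc_tokens):
        self.doc_tokens = doc_tokens

def convert(example):
    # Mirrors the failing loop: `for (i, token) in enumerate(example.doc_tokens)`
    return [token.lower() for token in example.doc_tokens]

examples = [SquadExample(["Extractive", "QA"])]

# Expected call shape: map over the examples themselves.
print(list(map(convert, examples)))  # [['extractive', 'qa']]

# Bug shape: the worker is handed the *list* of examples, so `example`
# is a list and has no `doc_tokens` attribute.
try:
    list(map(convert, [examples]))
except AttributeError as err:
    print(err)  # 'list' object has no attribute 'doc_tokens'
```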

Expected behavior

Same result as in transformers version 3.4.0:

[{'score': 0.6222442984580994, 'start': 34, 'end': 96, 'answer': 'the task of extracting an answer from a text given a question.'}, {'score': 0.5115318894386292, 'start': 147, 'end': 161, 'answer': 'SQuAD dataset,'}]

@LysandreJik
Member

Thank you for reporting this. Fixing it in #8765
