
Version 3.5 broke the multi context/questions feature for the QuestionAnsweringPipeline #8759

Closed
Mathieu4141 opened this issue Nov 24, 2020 · 1 comment · Fixed by #8765

@Mathieu4141

Environment info

  • transformers version: 3.5.1 (also in 3.5.0)
  • Platform: Darwin-20.1.0-x86_64-i386-64bit
  • Python version: 3.7.5
  • PyTorch version (GPU?): 1.7.0 (False)
  • Tensorflow version (GPU?): 2.3.1 (False)
  • Using GPU in script?: no
  • Using distributed or parallel set-up in script?: no

Who can help

tokenizers: @mfuntowicz

Information

Model I am using (Bert, XLNet ...): Default QuestionAnsweringPipeline

The task I am working on is:

  • an official GLUE/SQUaD task: Extractive Question Answering

To reproduce

Steps to reproduce the behavior:

  1. Install transformers 3.5.1 (also in 3.5.0)
  2. Run the following:
from transformers import pipeline

nlp = pipeline("question-answering")

context = r"""
Extractive Question Answering is the task of extracting an answer from a text given a question. An example of a
question answering dataset is the SQuAD dataset, which is entirely based on that task. If you would like to fine-tune
a model on a SQuAD task, you may leverage the `run_squad.py`.
"""

print(
    nlp(
        question=["What is extractive question answering?", "What is a good example of a question answering dataset?"],
        context=[context, context],
    )
)

In versions 3.5.0 and 3.5.1, I get this error:

multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/Users/cytadel/.pyenv/versions/3.7.5/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "/Users/cytadel/.pyenv/versions/3.7.5/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "/Users/cytadel/Library/Caches/pypoetry/virtualenvs/feedly.ml-cyber-attacks-4LjjtgqO-py3.7/lib/python3.7/site-packages/transformers/data/processors/squad.py", line 110, in squad_convert_example_to_features
    for (i, token) in enumerate(example.doc_tokens):
AttributeError: 'list' object has no attribute 'doc_tokens'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/cytadel/feedly/ml/do_not_commit.py", line 14, in <module>
    context=[context, context],
  File "/Users/cytadel/Library/Caches/pypoetry/virtualenvs/feedly.ml-cyber-attacks-4LjjtgqO-py3.7/lib/python3.7/site-packages/transformers/pipelines.py", line 1787, in __call__
    for example in examples
  File "/Users/cytadel/Library/Caches/pypoetry/virtualenvs/feedly.ml-cyber-attacks-4LjjtgqO-py3.7/lib/python3.7/site-packages/transformers/pipelines.py", line 1787, in <listcomp>
    for example in examples
  File "/Users/cytadel/Library/Caches/pypoetry/virtualenvs/feedly.ml-cyber-attacks-4LjjtgqO-py3.7/lib/python3.7/site-packages/transformers/data/processors/squad.py", line 368, in squad_convert_examples_to_features
    disable=not tqdm_enabled,
  File "/Users/cytadel/Library/Caches/pypoetry/virtualenvs/feedly.ml-cyber-attacks-4LjjtgqO-py3.7/lib/python3.7/site-packages/tqdm/std.py", line 1171, in __iter__
    for obj in iterable:
  File "/Users/cytadel/.pyenv/versions/3.7.5/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/pool.py", line 325, in <genexpr>
    return (item for chunk in result for item in chunk)
  File "/Users/cytadel/.pyenv/versions/3.7.5/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/pool.py", line 748, in next
    raise value
AttributeError: 'list' object has no attribute 'doc_tokens'
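The traceback suggests that the worker function receives the whole list of examples instead of a single example. Here is a minimal pure-Python sketch of that failure mode; `SquadExample` and `convert` are simplified stand-ins (not the actual `transformers` classes) used only to illustrate the mechanism:

```python
# Stand-in for transformers' SquadExample: an object carrying doc_tokens.
class SquadExample:
    def __init__(self, doc_tokens):
        self.doc_tokens = doc_tokens

def convert(example):
    # Mirrors the failing loop: `for (i, token) in enumerate(example.doc_tokens)`
    return [token.lower() for token in example.doc_tokens]

examples = [SquadExample(["Extractive", "QA"])]

# Expected call shape: map over the examples themselves.
print(list(map(convert, examples)))  # [['extractive', 'qa']]

# Bug shape: the worker is handed the *list* of examples, so `example`
# is a list and has no `doc_tokens` attribute.
try:
    list(map(convert, [examples]))
except AttributeError as err:
    print(err)  # 'list' object has no attribute 'doc_tokens'
```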

Expected behavior

Same result as in transformers version 3.4.0:

[{'score': 0.6222442984580994, 'start': 34, 'end': 96, 'answer': 'the task of extracting an answer from a text given a question.'}, {'score': 0.5115318894386292, 'start': 147, 'end': 161, 'answer': 'SQuAD dataset,'}]

@LysandreJik
Member

Thank you for reporting this. Fixing it in #8765
