RuntimeError: Not all operators have been evaluated. A variable name is probably misspelled. #817

ebolotin6 · 2022-01-28T01:02:59Z

Hello,

I have the following sklearn pipeline with a StackingClassifier that uses 2 XGB classifiers (stored in a dict) as estimators:

stacking_ensemble = StackingClassifier(
    estimators=list(map(tuple, classifiers.items())),
    stack_method='predict_proba',
    passthrough=False
)

pipeline = Pipeline(steps=[
    ('cbe', ColTransformer()),
    ('sc', stacking_ensemble),
])
pipeline.fit(x_train, y_train)

If I try to convert stacking_ensemble to onnx on its own - it works.
If I try to convert ColTransformer to onnx on its own - it works.
If I try to convert a sklearn pipeline with ColTransformer and any other sklearn model (including ensemble models like voting classifier) - it works.

However, when I try to convert the above pipeline (specifically with a StackingClassifier) to onnx, I get this:
RuntimeError: Not all operators have been evaluated. A variable name is probably misspelled.

The only operator left with is_eval=None is this one:

Operator(type='SklearnLinearClassifier', onnx_name='SklearnLinearClassifier', inputs='merged_probability_tensor', outputs='label3,probability_tensor2', raw_operator=LogisticRegression())

This operator is the final_estimator of the StackingClassifier, which defaults to a LogisticRegression classifier per the scikit-learn documentation.
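The default final estimator can be checked directly on a fitted model. This is a minimal sketch, not the pipeline from this issue: the synthetic data and the two decision trees (standing in for the XGB classifiers) are assumptions made so the snippet runs with scikit-learn alone.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.tree import DecisionTreeClassifier

# Small synthetic dataset; stands in for x_train / y_train (assumption).
X, y = make_classification(n_samples=60, n_features=5, random_state=0)

# Decision trees stand in for the two XGB classifiers (assumption).
classifiers = {
    "tree_a": DecisionTreeClassifier(max_depth=2, random_state=0),
    "tree_b": DecisionTreeClassifier(max_depth=3, random_state=1),
}

stack = StackingClassifier(
    estimators=list(classifiers.items()),
    stack_method="predict_proba",
    passthrough=False,
)
stack.fit(X, y)

# No final_estimator was passed, so scikit-learn fits its default one.
final_estimator_name = type(stack.final_estimator_).__name__
print(final_estimator_name)
```

The fitted final_estimator_ is the object that shows up as raw_operator=LogisticRegression() in the operator dump above.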

Do you know what the problem might be? Any help is greatly appreciated.

Thank you very much,
EB


xadupre · 2022-02-01T12:03:30Z

Which version of scikit-learn are you using?

xadupre · 2022-02-01T12:19:34Z

I tried to cover your example by adding two unit tests but it did not fail for me. What are the differences between your model and the ones I added in PR #820?

ebolotin6 · 2022-02-01T22:03:51Z

Hello, huge thanks for your reply!

I'm using sklearn version: 1.0.1.

Attached is the graph of the pipeline. But first, let me clarify the pipeline from above:

pipeline = Pipeline(steps=[
    ('cbe', ColTransformer()),
    ('sc', stacking_ensemble),
])

In the above, ColTransformer is a transformer used for converting a dataframe or numpy matrix of mixed types (strings, ints, floats) into the same shaped output of floats. Its specific name (in the attached graph) is CatColEncoder and its main purpose is for encoding categorical columns. Some notes:

The output shape of the CatColEncoder is the same as its input shape.
The type of the input data to CatColEncoder can be a matrix of mixed types (strings, floats, ints), and the output type of CatColEncoder will be a float matrix.
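To make the two notes above concrete, here is a rough plain-Python sketch of an encoder with the same contract: mixed-type rows in, a float matrix of the same shape out. It is not the real CatColEncoder (that code lives in the attached zip); the class name ToyCatColEncoder, the sorted label encoding, and the -1.0 value for unseen categories are all assumptions for illustration.

```python
class ToyCatColEncoder:
    """Toy stand-in: label-encodes string columns, casts the rest to float."""

    def fit(self, rows):
        n_cols = len(rows[0])
        self.mappings_ = {}
        for j in range(n_cols):
            column = [row[j] for row in rows]
            # A column holding any string is treated as categorical.
            if any(isinstance(v, str) for v in column):
                categories = sorted({str(v) for v in column})
                self.mappings_[j] = {c: float(i) for i, c in enumerate(categories)}
        return self

    def transform(self, rows):
        out = []
        for row in rows:
            encoded = []
            for j, value in enumerate(row):
                if j in self.mappings_:
                    # Unseen categories map to -1.0 (arbitrary choice for this sketch).
                    encoded.append(self.mappings_[j].get(str(value), -1.0))
                else:
                    encoded.append(float(value))
            out.append(encoded)
        return out


rows = [["red", 1, 0.5], ["blue", 2, 1.5], ["red", 3, 2.5]]
encoded = ToyCatColEncoder().fit(rows).transform(rows)
print(encoded)
```

Every input column yields exactly one output column, which is why the output shape matches the input shape.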

Regarding the attached:

On line 122 is this: Identity: ['index1'] -> ['column_index'].

If you look at the graph, index1 is the first label-encoded column of CatColEncoder.
- It seems to appear out of nowhere in this Identity node (?)
The variable name column_index is defined on line 74 of stacking.py (inside the operator_converters directory):

column_index_name = scope.get_unique_variable_name('column_index')

However, the identity operation itself is defined on line 28 of pipelines.py:

    for fr, to in zip(outputs, operator.outputs):
        container.add_node(
            'Identity', fr.full_name, to.full_name,
            name=scope.get_unique_operator_name("Id" + operator.onnx_name))
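For readers unfamiliar with these two snippets, the sketch below imitates what they do with toy objects: one call hands out a fresh variable name from a seed, and the loop wires each inner output to a declared output through an Identity node. ToyScope, ToyContainer, the suffix scheme, and the variable names are hypothetical stand-ins, not skl2onnx classes, and the sketch does not explain where index1 comes from.

```python
class ToyScope:
    """Hands out names that are unique within this scope (toy version)."""

    def __init__(self):
        self._used = set()

    def get_unique_variable_name(self, seed):
        # Append an increasing integer until the name is unused.
        name, i = seed, 0
        while name in self._used:
            i += 1
            name = f"{seed}{i}"
        self._used.add(name)
        return name


class ToyContainer:
    """Collects graph nodes as plain tuples (toy version)."""

    def __init__(self):
        self.nodes = []

    def add_node(self, op_type, inputs, outputs, name=None):
        self.nodes.append((op_type, inputs, outputs, name))


scope = ToyScope()
container = ToyContainer()

# Requesting the same seed twice yields two distinct names.
first = scope.get_unique_variable_name("column_index")
second = scope.get_unique_variable_name("column_index")

# Connect each inner output to the matching declared output with an Identity node.
inner_outputs = ["inner_label", "inner_proba"]
declared_outputs = ["label", "probabilities"]
for fr, to in zip(inner_outputs, declared_outputs):
    container.add_node("Identity", fr, to, name=f"Id_{to}")

print(first, second)
print(container.nodes)
```

In this toy model an Identity node only renames a tensor, so an unexpected pairing such as index1 -> column_index would mean the two zipped lists were not the ones the converter intended to pair.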

There is another identity operation in stacking.py but, ironically, I don't think it's responsible for this operation: Identity: ['index1'] -> ['column_index']

I'm at a bit of a loss, not sure where this error is originating or what's causing the graph to break. The output of CatColEncoder successfully flows through every step of the pipeline - and conversion to onnx is successful when a voting classifier is used instead of a stacking classifier.

Any thoughts, hints, suggestions are very appreciated.

Thank you,
EB

xadupre · 2022-02-02T14:02:34Z

I'm puzzled. I tried to use a dataframe as an input but it still works (see the last commit in the PR). And if ColTransformer is a custom transformer, the conversion should have failed with a message saying there is no converter for this class, unless you registered one yourself. How do you convert the model?

ebolotin6 · 2022-02-03T00:33:58Z

I'm puzzled too, and I really want to get this to work. Attached is a zip file that contains a demo notebook that you can run. The converter is inside cce_onnx_converter.py.

catcol_demo.zip

Thanks again,
EB

xadupre · 2022-02-04T13:35:44Z

I tried your notebook but nothing fails for me. I then tried the following pipeline but still no failure. I did not find any model with StackingClassifier. Did I miss something?

pipeline = Pipeline(steps=[
    ('cbe', CatColEncoder(all_col_names=x_df.columns)),
    ('nan', SimpleImputer()),
    ('sc', LogisticRegression()),
])
pipeline.fit(x_df, y)

ebolotin6 · 2022-02-04T20:48:04Z

Hi, thanks for responding! Attached is a new notebook that fully demonstrates the bug (titled sc_bug.ipynb). The traceback is at the bottom. Notice 2 things from the traceback graph:

the final operator is set to is_eval=None
This identity appears out of nowhere (or at least unintentionally): Identity: ['index1'] -> ['column_index']
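One way to picture how the two symptoms above could be related is a toy propagation check: an operator is marked as evaluated only once all of its inputs have been produced, so a single mis-wired variable name leaves a downstream operator unevaluated. This is a sketch under that assumption, not skl2onnx's actual code. ToyOperator, propagate, check, and the deliberately mismatched name merged_probs are hypothetical, and the thread does not describe the real cause that PR #820 fixed.

```python
from dataclasses import dataclass


@dataclass
class ToyOperator:
    name: str
    inputs: list
    outputs: list
    is_evaluated: bool = False


def propagate(operators, graph_inputs):
    """Mark operators as evaluated once all their inputs are available."""
    available = set(graph_inputs)
    progress = True
    while progress:
        progress = False
        for op in operators:
            if not op.is_evaluated and all(i in available for i in op.inputs):
                op.is_evaluated = True
                available.update(op.outputs)
                progress = True
    # Whatever is left could never be reached from the graph inputs.
    return [op for op in operators if not op.is_evaluated]


def check(operators, graph_inputs):
    pending = propagate(operators, graph_inputs)
    if pending:
        names = ", ".join(op.name for op in pending)
        raise RuntimeError(f"Not all operators have been evaluated: {names}")


ops = [
    ToyOperator("encoder", ["X"], ["encoded"]),
    # Contrived wiring mistake: this step produces 'merged_probs' ...
    ToyOperator("base_models", ["encoded"], ["merged_probs"]),
    # ... while the final estimator waits for 'merged_probability_tensor'.
    ToyOperator("final_estimator", ["merged_probability_tensor"],
                ["label", "probabilities"]),
]

try:
    check(ops, ["X"])
    error_message = None
except RuntimeError as exc:
    error_message = str(exc)

print(error_message)
```

Here only final_estimator stays unevaluated, which mirrors the single operator with is_eval=None in the traceback.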

And to reiterate: the problem is specifically with a pipeline that contains the StackingClassifier (other classifiers work fine):

catcol_demo.zip

Much appreciated,
EB

xadupre · 2022-02-09T13:37:29Z

I was finally able to replicate the bug and found the cause. I updated the PR to fix the bug. I should release a new version by the end of the week.

ebolotin6 · 2022-02-09T21:28:56Z

Excellent, glad that you've solved it!! I look forward to the next version and will test again!

Thanks much,
EB

xadupre mentioned this issue Feb 1, 2022

Fixes issue #817 and method _propagation_status #820

Merged

xadupre added a commit that referenced this issue Feb 9, 2022

Fixes issue #817 and method _propagation_status (#820)

8f5e6e2

* investigate an issue with StackingClassifier * fix issue 817

ebolotin6 closed this as completed Feb 9, 2022

DiTo97 mentioned this issue Feb 21, 2024

graph may be disconnected with StackingClassifier #1069

Closed
