contributions mismatch for nominal features #16

lboussengui · 2024-06-04T11:53:26Z

ebm2onnx version: 3.1.1
onnxruntime : 1.16.1
interpret : 0.4.2
Python version: 3.10.8
Operating System: MacOS

Description

I trained an EBM classification model. This model was initially saved in pickle format.

I used ebm2onnx as shown below to convert my model to the .onnx format.

I noticed that, for a given test case, the per-feature contributions to the prediction differ for nominal features once the model is converted to ONNX: those contributions are set to zero.

Do you have an explanation for this?

What I Did

import ebm2onnx
import pickle
import onnxruntime as rt

# load first EBM
with open(f'{MODEL_PATH}ebm_first.pkl', 'rb') as f:
    ebm_first = pickle.load(f)

# load dtypes saved during model training
with open(f'{MODEL_PATH}training_dtypes_for_onnx.pkl', 'rb') as f:
    training_dtypes_for_onnx = pickle.load(f)

# transform ebm to onnx 
onnx_model = ebm2onnx.to_onnx(
    model=ebm_first,
    predict_proba=True,  # Generate a dedicated output for probabilities
    explain=True,  # Generate a dedicated output for local explanations
    dtype=training_dtypes_for_onnx,
    name='DEFAULT',
)

Here is the result of the local explanation with the pickled EBM model for one example:

pred_pkl = ebm_first.explain_local(X_test, y_test)
pred_pkl.data(0)['scores']

result is

array([ 0.027,  0.416, -0.158,  0.388,  0.043,  0.   , -0.196,  0.051,
       -0.201, -0.032,  0.176,  0.151,  0.   ,  0.216,  0.2  ,  0.376,
        0.05 ,  0.022, -0.076,  0.028, -0.26 , -0.043,  0.173,  0.269,
       -0.203, -0.025,  0.037, -0.056,  0.164,  0.296,  0.089,  0.08 ,
        0.1  ,  0.098, -0.018, -0.002, -0.001, -0.001, -0.003, -0.002])

After transforming ebm_first to onnx_model, I did the following to imitate inference in production:

onnx_model.ir_version = 9
ebm_onnx = rt.InferenceSession(onnx_model.SerializeToString())
pred_onnx = ebm_onnx.run(None, X_test.to_dict("list"))

# contributions of pred_onnx 
pred_onnx[2][0][:, 0]

result is

array([ 0.027,  0.416, -0.158,  0.388,  0.   ,  0.   , -0.196,  0.051,
       -0.201, -0.032,  0.176,  0.151,  0.   ,  0.216,  0.2  ,  0.376,
        0.05 ,  0.022, -0.076,  0.028, -0.26 , -0.043,  0.173,  0.269,
       -0.203, -0.025,  0.037, -0.056,  0.164,  0.296,  0.089,  0.08 ,
        0.1  ,  0.098, -0.018, -0.002, -0.001, -0.001, -0.003, -0.002],
      dtype=float32)

The two arrays differ at index 4 (0.043 vs. 0.); indices 4 and 5 are the only nominal features of the dataset.
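A quick way to locate such mismatches without reading the arrays by eye is to compare them element by element with a tolerance. The following is a minimal sketch, not part of the original report: the helper name mismatched_indices and the 1e-3 tolerance are assumptions chosen for illustration, and only the first eight values of each array above are copied in.

```python
import math

# First eight contributions copied from the two arrays above.
scores_pkl = [0.027, 0.416, -0.158, 0.388, 0.043, 0.0, -0.196, 0.051]
scores_onnx = [0.027, 0.416, -0.158, 0.388, 0.0, 0.0, -0.196, 0.051]


def mismatched_indices(a, b, abs_tol=1e-3):
    """Return the positions where two score lists are not close within abs_tol."""
    if len(a) != len(b):
        raise ValueError("score lists must have the same length")
    return [
        i
        for i, (x, y) in enumerate(zip(a, b))
        if not math.isclose(x, y, abs_tol=abs_tol)
    ]


print(mismatched_indices(scores_pkl, scores_onnx))
```

On the values shown, only index 4 is reported; index 5 is 0. in both arrays. The tolerance absorbs the float32 vs. float64 rounding difference between the two outputs.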

MainRo · 2024-06-20T15:38:11Z

Could you publish a model and a sample input here that reproduce the issue?
In the meantime, I will look at reproducing this in a unit test.

MainRo · 2024-07-25T15:50:48Z

Could you confirm that the nominal features are of type boolean?
If this is the case, then can you try to explicitly convert them to 0/1 before calling ebm_first.explain_local:

X_test['feature'] = np.where(X_test['feature'] == False, 0, 1)

I suspect an issue in the interpret explain_local implementation. It looks like the boolean features are not correctly mapped, and have scores of 0.0.

MainRo · 2024-07-25T15:58:51Z

OK, forget my last comment: the problem is that the conversion to ONNX mutates the EBM model object.
If you call ebm_first.explain_local before converting to ONNX, you will get the same values.

Obviously, this is not the intended behavior of the converter. I will fix this.
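Until a fixed release is installed, one possible workaround is to hand the converter a deep copy, so that any in-place changes land on the copy rather than on the model you keep using. The sketch below only illustrates that pattern: ToyModel and mutating_convert are hypothetical stand-ins, not ebm2onnx or interpret code, and it assumes the real EBM object can be deep-copied (plausible since it pickles, but not verified here).

```python
import copy


def convert_on_copy(model, convert, **kwargs):
    """Run `convert` on a deep copy so the caller's model is left untouched."""
    return convert(copy.deepcopy(model), **kwargs)


# Hypothetical stand-ins, used only to make the sketch runnable.
class ToyModel:
    def __init__(self):
        self.term_scores = [0.043, 0.0]


def mutating_convert(model, **kwargs):
    # Mimics a converter that edits its input in place.
    model.term_scores[0] = 0.0
    return {"converted_scores": list(model.term_scores)}


toy = ToyModel()
result = convert_on_copy(toy, mutating_convert)

# The original object still holds its scores; only the copy was changed.
print(toy.term_scores)
```

Applied to the report above, this would amount to passing model=copy.deepcopy(ebm_first) to ebm2onnx.to_onnx and leaving the other arguments unchanged.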

MainRo added the bug Something isn't working label Jun 20, 2024

MainRo self-assigned this Jun 20, 2024

MainRo added a commit that referenced this issue Jul 25, 2024

do not mutate the ebm model during conversion

7b7b3cc

This prevents from using the ebm model correctly after the conversion. Fixes #16

MainRo mentioned this issue Jul 25, 2024

do not mutate the ebm model during conversion #19

Merged

MainRo added a commit that referenced this issue Jul 25, 2024

do not mutate the ebm model during conversion

86cc6e8

This prevents from using the ebm model correctly after the conversion. Fixes #16 Signed-off-by: Romain Picard <[email protected]>

MainRo closed this as completed in #19 Jul 25, 2024

MainRo closed this as completed in c4bf7ed Jul 25, 2024
