Error loading a saved LabelModel and using it to predict #1460

cdeepakroy · 2019-09-12T17:32:27Z

Issue description

I wanted to save a label model trained within a jupyter notebook and use it in standalone python scripts elsewhere.

I saved the model with the snorkel.labeling.LabelModel.save() method, then loaded it with snorkel.labeling.LabelModel.load(). Calling predict() on the loaded model throws the following error:

AttributeError: 'LabelModel' object has no attribute 'c_tree'

Code example/repro steps

import numpy as np
import snorkel.labeling

L_train = np.random.randint(-1, 2, size=(10**6, 10), dtype=np.int8)

lm = snorkel.labeling.LabelModel()
lm.fit(L_train)

lm.save('label_model.pt')  # save the trained model to disk

lm2 = snorkel.labeling.LabelModel()
lm2.load('label_model.pt')
lm2.predict(L_train)  # throws AttributeError: 'LabelModel' object has no attribute 'c_tree'

Error stack trace

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-57-39ae7bf111ea> in <module>
      1 lm2 = snorkel.labeling.LabelModel()
      2 lm2.load('label_model.pt')
----> 3 lm2.predict(L_train)

~\AppData\Local\Continuum\anaconda3\envs\snorkel\lib\site-packages\snorkel\labeling\model\label_model.py in predict(self, L, return_probs, tie_break_policy)
    429         array([0, 1, 0])
    430         """
--> 431         Y_probs = self.predict_proba(L)
    432         Y_p = probs_to_preds(Y_probs, tie_break_policy)
    433         if return_probs:

~\AppData\Local\Continuum\anaconda3\envs\snorkel\lib\site-packages\snorkel\labeling\model\label_model.py in predict_proba(self, L)
    377         L_shift = L + 1  # convert to {0, 1, ..., k}
    378         self._set_constants(L_shift)
--> 379         L_aug = self._get_augmented_label_matrix(L_shift)
    380         mu = np.clip(self.mu.detach().clone().numpy(), 0.01, 0.99)
    381         jtm = np.ones(L_aug.shape[1])

~\AppData\Local\Continuum\anaconda3\envs\snorkel\lib\site-packages\snorkel\labeling\model\label_model.py in _get_augmented_label_matrix(self, L, higher_order)
    178                     [
    179                         j
--> 180                         for j in self.c_tree.nodes()
    181                         if i in self.c_tree.node[j]["members"]
    182                     ]

~\AppData\Local\Continuum\anaconda3\envs\snorkel\lib\site-packages\torch\nn\modules\module.py in __getattr__(self, name)
    537                 return modules[name]
    538         raise AttributeError("'{}' object has no attribute '{}'".format(
--> 539             type(self).__name__, name))
    540 
    541     def __setattr__(self, name, value):

AttributeError: 'LabelModel' object has no attribute 'c_tree'

System info

How you installed Snorkel (conda, pip, source): conda
OS: Windows 10
Python version: 3.7.4
Snorkel version: 0.9.0

paroma · 2019-09-13T00:04:04Z

Thank you for posting the details of this error; we were able to reproduce it. The c_tree attribute is created in the fit() method, so a freshly constructed LabelModel that has only called load() does not have it, which is why predict() throws this error. For now, fitting the second LabelModel instance on a dummy L matrix before loading the model should work.

We will fix this bug in the upcoming release.

Here's a modification to your example that will get it working:

import numpy as np
import snorkel.labeling

L_train = np.random.randint(-1, 2, size=(10**6, 10), dtype=np.int8)
lm = snorkel.labeling.LabelModel()
lm.fit(L_train)
lm.save('label_model.pt') 

#an additional call to .fit() with a dummy L here
L_train_dummy = np.random.randint(-1, 2, size=(10**6, 10), dtype=np.int8)
lm2 = snorkel.labeling.LabelModel()
lm2.fit(L_train_dummy)
lm2.load('label_model.pt')
lm2.predict(L_train)

#check predictions are as expected
original_preds = lm.predict(L_train)
loaded_preds = lm2.predict(L_train)
np.sum(original_preds != loaded_preds) #should return 0

cdeepakroy · 2019-09-13T16:32:51Z

@paroma Thanks a lot for looking into this and suggesting a workaround.

Could you explain what the c_tree attribute is? It seems to be of type networkx.classes.graph.Graph. In my case the graph has 10 nodes (equal to the number of labeling functions) and 0 edges. Is this the graph encoding the graphical model representation of the relationships between the random variables corresponding to the labeling functions and the predicted label Y? If so, I am concerned that refitting on the dummy matrix will learn the wrong relationships between the random variables.

I ran print(lm.state_dict()) and found that it contains a tensor called mu:

OrderedDict([('mu', tensor([[0.3690, 0.2977],
        [0.2977, 0.3690],
        [0.3693, 0.2975],
        [0.2975, 0.3691],
        [0.3691, 0.2977],
        [0.2976, 0.3690],
        [0.3690, 0.2977],
        [0.2976, 0.3691],
        [0.3689, 0.2976],
        [0.2977, 0.3690],
        [0.3691, 0.2975],
        [0.2976, 0.3691],
        [0.3690, 0.2975],
        [0.2976, 0.3692],
        [0.3690, 0.2977],
        [0.2976, 0.3690],
        [0.3692, 0.2974],
        [0.2975, 0.3692],
        [0.3690, 0.2975],
        [0.2975, 0.3692]]))])

Could you explain what the tensor mu is used for, or point me to a paper where the mu notation is used? In my case the shape of mu is torch.Size([20, 2]) for 10 labeling functions. If all the information needed for predict() is present in the state_dict, then I was thinking I could save the state_dict and write a standalone function that calculates the prediction.

paroma · 2019-09-20T02:28:36Z

Both mu and c_tree (the junction tree) are defined in the AAAI'19 paper. Please try out the new save/load methods for LabelModel, and thanks for pointing this out!
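
On the standalone-prediction idea raised earlier in the thread, here is a rough sketch in plain Python of how a posterior over Y could be computed from mu alone. It rests on several assumptions that are not confirmed in this thread: that row i*k + v of mu holds P(LF_i votes v | Y = y) for each class y (which would be consistent with the reported shape [20, 2] for 10 labeling functions and 2 classes), that the labeling functions are conditionally independent given Y (c_tree has no edges), and that a class prior is supplied by the caller. The function name and the class_balance parameter are hypothetical; the clipping range mirrors the np.clip call visible in the stack trace.

```python
import math


def predict_proba_from_mu(L, mu, class_balance, clip=(0.01, 0.99)):
    """Naive-Bayes-style posterior over Y for each row of a label matrix.

    L: rows of votes in {-1, 0, ..., k-1}, where -1 means abstain.
    mu: m*k rows by k columns; row i*k + v is ASSUMED to hold
        P(LF_i votes v | Y = y) for y in 0..k-1.
    class_balance: prior P(Y = y), a sequence of length k (hypothetical input).
    """
    k = len(class_balance)
    lo, hi = clip
    probs = []
    for votes in L:
        # Start from the log prior, then add one log-likelihood term per vote.
        log_post = [math.log(p) for p in class_balance]
        for i, v in enumerate(votes):
            if v == -1:
                continue  # abstains contribute nothing
            row = mu[i * k + v]
            for y in range(k):
                log_post[y] += math.log(min(max(row[y], lo), hi))
        # Normalize in log space for numerical stability.
        top = max(log_post)
        weights = [math.exp(x - top) for x in log_post]
        total = sum(weights)
        probs.append([w / total for w in weights])
    return probs


# Toy values for two labeling functions and two classes (not from a real model).
toy_mu = [
    [0.7, 0.2],  # LF 0 votes 0
    [0.2, 0.7],  # LF 0 votes 1
    [0.6, 0.3],  # LF 1 votes 0
    [0.3, 0.6],  # LF 1 votes 1
]
toy_L = [[0, 0], [-1, -1], [1, -1]]
toy_probs = predict_proba_from_mu(toy_L, toy_mu, class_balance=[0.5, 0.5])
print(toy_probs)
```

With real values, mu would come from lm.state_dict()['mu'] converted to nested lists. Any such function should be checked against lm.predict_proba() on the same matrix before being relied on, since the layout assumed here may not match the library's internals.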

cdeepakroy changed the title ~~Error saving trained LabelModel using save() method~~ Error loading a saved LabelModel and using it to predict Sep 12, 2019

paroma self-assigned this Sep 13, 2019

paroma added the bug label Sep 13, 2019

paroma mentioned this issue Sep 17, 2019 in Saving all attributes of LabelModel #1463 (merged)

paroma closed this as completed in #1463 Sep 20, 2019
