Classes fail to pickle in Python 3 when super is referenced when dumping session #300

Closed
charlesccychen opened this issue Feb 2, 2019 · 21 comments · Fixed by #443

@charlesccychen

charlesccychen commented Feb 2, 2019

This is a problematic scenario on Python 3 (tested to be failing with Python 3.5 and 3.6). The following code dumps the main session:

import dill

class A(object):
	pass

class B(A):
	def __init__(self):
		super(B, self).__init__()

dill.dump_session('test4.dump')

When this is loaded, it causes an error:

root@c3690280664a:/# python -c 'import dill; dill.load_session("test4.dump")'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/local/lib/python3.6/site-packages/dill/_dill.py", line 410, in load_session
    module = unpickler.load()
  File "/usr/local/lib/python3.6/site-packages/dill/_dill.py", line 474, in find_class
    return StockUnpickler.find_class(self, module, name)
AttributeError: Can't get attribute 'B' on <module '__main__' (built-in)>

For reference, this is the disassembled pickle output:

root@c3690280664a:/# python -c 'import pickletools; pickletools.dis(open("test4.dump", "rb").read())'
    0: \x80 PROTO      3
    2: c    GLOBAL     'dill._dill _import_module'
   29: q    BINPUT     0
   31: X    BINUNICODE '__main__'
   44: q    BINPUT     1
   46: \x85 TUPLE1
   47: q    BINPUT     2
   49: R    REDUCE
   50: q    BINPUT     3
   52: }    EMPTY_DICT
   53: q    BINPUT     4
   55: (    MARK
   56: X        BINUNICODE '__name__'
   69: q        BINPUT     5
   71: h        BINGET     1
   73: X        BINUNICODE '__doc__'
   85: q        BINPUT     6
   87: N        NONE
   88: X        BINUNICODE '__package__'
  104: q        BINPUT     7
  106: N        NONE
  107: X        BINUNICODE '__spec__'
  120: q        BINPUT     8
  122: N        NONE
  123: X        BINUNICODE '__annotations__'
  143: q        BINPUT     9
  145: }        EMPTY_DICT
  146: q        BINPUT     10
  148: X        BINUNICODE '__file__'
  161: q        BINPUT     11
  163: X        BINUNICODE '1.py'
  172: q        BINPUT     12
  174: X        BINUNICODE '__cached__'
  189: q        BINPUT     13
  191: N        NONE
  192: X        BINUNICODE 'dill'
  201: q        BINPUT     14
  203: h        BINGET     0
  205: h        BINGET     14
  207: \x85     TUPLE1
  208: q        BINPUT     15
  210: R        REDUCE
  211: q        BINPUT     16
  213: X        BINUNICODE 'A'
  219: q        BINPUT     17
  221: c        GLOBAL     'dill._dill _create_type'
  246: q        BINPUT     18
  248: (        MARK
  249: c            GLOBAL     'dill._dill _load_type'
  272: q            BINPUT     19
  274: X            BINUNICODE 'type'
  283: q            BINPUT     20
  285: \x85         TUPLE1
  286: q            BINPUT     21
  288: R            REDUCE
  289: q            BINPUT     22
  291: h            BINGET     17
  293: h            BINGET     19
  295: X            BINUNICODE 'object'
  306: q            BINPUT     23
  308: \x85         TUPLE1
  309: q            BINPUT     24
  311: R            REDUCE
  312: q            BINPUT     25
  314: \x85         TUPLE1
  315: q            BINPUT     26
  317: }            EMPTY_DICT
  318: q            BINPUT     27
  320: (            MARK
  321: X                BINUNICODE '__module__'
  336: q                BINPUT     28
  338: h                BINGET     1
  340: h                BINGET     6
  342: N                NONE
  343: u                SETITEMS   (MARK at 320)
  344: t            TUPLE      (MARK at 248)
  345: q        BINPUT     29
  347: R        REDUCE
  348: q        BINPUT     30
  350: X        BINUNICODE 'B'
  356: q        BINPUT     31
  358: h        BINGET     18
  360: (        MARK
  361: h            BINGET     22
  363: h            BINGET     31
  365: h            BINGET     30
  367: \x85         TUPLE1
  368: q            BINPUT     32
  370: }            EMPTY_DICT
  371: q            BINPUT     33
  373: (            MARK
  374: h                BINGET     28
  376: h                BINGET     1
  378: X                BINUNICODE '__init__'
  391: q                BINPUT     34
  393: c                GLOBAL     'dill._dill _create_function'
  422: q                BINPUT     35
  424: (                MARK
  425: h                    BINGET     19
  427: X                    BINUNICODE 'CodeType'
  440: q                    BINPUT     36
  442: \x85                 TUPLE1
  443: q                    BINPUT     37
  445: R                    REDUCE
  446: q                    BINPUT     38
  448: (                    MARK
  449: K                        BININT1    1
  451: K                        BININT1    0
  453: K                        BININT1    1
  455: K                        BININT1    3
  457: K                        BININT1    3
  459: C                        SHORT_BINBYTES b't\x00t\x01|\x00\x83\x02j\x02\x83\x00\x01\x00d\x00S\x00'
  479: q                        BINPUT     39
  481: N                        NONE
  482: \x85                     TUPLE1
  483: q                        BINPUT     40
  485: X                        BINUNICODE 'super'
  495: q                        BINPUT     41
  497: h                        BINGET     31
  499: h                        BINGET     34
  501: \x87                     TUPLE3
  502: q                        BINPUT     42
  504: X                        BINUNICODE 'self'
  513: q                        BINPUT     43
  515: \x85                     TUPLE1
  516: q                        BINPUT     44
  518: X                        BINUNICODE '1.py'
  527: q                        BINPUT     45
  529: h                        BINGET     34
  531: K                        BININT1    6
  533: C                        SHORT_BINBYTES b'\x00\x01'
  537: q                        BINPUT     46
  539: X                        BINUNICODE '__class__'
  553: q                        BINPUT     47
  555: \x85                     TUPLE1
  556: q                        BINPUT     48
  558: )                        EMPTY_TUPLE
  559: t                        TUPLE      (MARK at 448)
  560: q                    BINPUT     49
  562: R                    REDUCE
  563: q                    BINPUT     50
  565: c                    GLOBAL     '__builtin__ __main__'
  587: h                    BINGET     34
  589: N                    NONE
  590: c                    GLOBAL     'dill._dill _create_cell'
  615: q                    BINPUT     51
  617: c                    GLOBAL     '__main__ B'
  629: q                    BINPUT     52
  631: \x85                 TUPLE1
  632: q                    BINPUT     53
  634: R                    REDUCE
  635: q                    BINPUT     54
  637: \x85                 TUPLE1
  638: q                    BINPUT     55
  640: }                    EMPTY_DICT
  641: q                    BINPUT     56
  643: t                    TUPLE      (MARK at 424)
  644: q                BINPUT     57
  646: R                REDUCE
  647: q                BINPUT     58
  649: h                BINGET     6
  651: N                NONE
  652: u                SETITEMS   (MARK at 373)
  653: t            TUPLE      (MARK at 360)
  654: q        BINPUT     59
  656: R        REDUCE
  657: 0        POP
  658: h        BINGET     52
  660: u        SETITEMS   (MARK at 55)
  661: b    BUILD
  662: .    STOP
highest protocol among opcodes = 3
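
In a disassembly this long, the by-reference entries (for example the GLOBAL '__main__ B' at offset 617) are easy to miss. A minimal standard-library sketch for filtering them out follows. `global_refs` is a hypothetical helper name, and it only handles the GLOBAL opcode used by protocols up to 3 (protocol 4 and later use STACK_GLOBAL, which this sketch ignores). The payload is a stand-in built with the standard pickler, not the session dump from this report.

```python
import collections
import pickle
import pickletools


def global_refs(payload):
    """Return the 'module name' strings of all GLOBAL opcodes in a pickle."""
    return [
        arg
        for opcode, arg, _pos in pickletools.genops(payload)
        if opcode.name == "GLOBAL"
    ]


# Stand-in payload: an OrderedDict is pickled by reference to its class.
payload = pickle.dumps(collections.OrderedDict(a=1), protocol=3)
refs = global_refs(payload)
print(refs)
```

Run against the bytes of test4.dump, a filter like this should list '__main__ B' next to the dill._dill helper functions, which is the entry the unpickler later fails to resolve.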

CC: @aaltay, @tvalentyn, @markflyhigh

@charlesccychen
Author

CC: @robertwb

@ghost

ghost commented Feb 22, 2019

Just adding that (after seeing this post) I found that removing uses of super() in my code solved some problems I was having with dill as well.

Edit: For reference, I wasn't trying to save a session, I was experiencing the same behavior (and solution) with dilling and undilling a class instance.

@robb-brown

robb-brown commented May 23, 2019

Any comments from the developers on this? It seems likely to be related to #56, #75, #209.

A minimal example that causes this issue is:

import dill
class A():
	def __init__(self,**args):
		super

a = A()
with open('a.dill','wb') as f:
	dill.dump(a,f)

Loading a.dill gives the error:

dill/_dill.py", line 474, in find_class
    return StockUnpickler.find_class(self, module, name)
AttributeError: Can't get attribute 'A' on <module '__main__' (built-in)>

It doesn't seem to matter whether you use the old- or new-style super, or whether dump is called with byref=True or byref=False.
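
One detail that may help explain why a bare reference to super is enough: in Python 3, the compiler gives any method whose body mentions the name super an implicit __class__ free variable, and its closure cell points back at the class being defined. The same '__class__' entry, and a cell holding '__main__ B', are visible in the disassembly in the original report. A small standard-library sketch (the class names are illustrative only):

```python
class WithSuper:
    def __init__(self, **args):
        super  # bare reference, as in the minimal example


class WithoutSuper:
    def __init__(self, **args):
        pass


with_code = WithSuper.__init__.__code__
print(with_code.co_names)     # global names used by the method
print(with_code.co_freevars)  # implicit free variables

# The closure cell refers back to the class itself: a cycle between the
# class and its own method that a pickler has to deal with.
cell = WithSuper.__init__.__closure__[0]
print(cell.cell_contents is WithSuper)

# A method that never mentions `super` has no closure at all.
print(WithoutSuper.__init__.__closure__)
```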

@dkolobok

I'm having the same issue. Is there any news regarding it?

@robb-brown

robb-brown commented Jun 24, 2019

I had the same issue. It sounds like the solution in dill might be difficult, but there’s an easy workaround. For a class declared in main, for example:

class MyClass(SuperClass):

instead of using super(), use the explicit class name:

SuperClass.__init__(self)

If you change the super class you have to remember to change the calls as well, but since it only affects classes declared in main it’s not too bad.
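
A fuller version of this workaround, as a minimal sketch: the attribute names are made up for illustration, and whether it avoids the dill failure on a given dill version is not verified here (a later comment reports also needing byref=False).

```python
class SuperClass:
    def __init__(self):
        self.base_initialised = True


class MyClass(SuperClass):
    def __init__(self):
        # Explicit base-class call instead of super().__init__()
        SuperClass.__init__(self)
        self.ready = True


obj = MyClass()
print(obj.base_initialised, obj.ready)

# The method never mentions `super`, so the compiler adds no implicit
# __class__ free variable to it.
print(MyClass.__init__.__code__.co_names)
print(MyClass.__init__.__code__.co_freevars)
```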

@mmckerns
Member

mmckerns commented Jun 26, 2019

@robb-brown: you are correct, this issue has the same root as #56 (and the others you referenced). I'm closing this as duplicate. We are well aware of the issue, but don't have a robust solution for it yet.

@charlesccychen: thanks for reporting this issue.

@mmckerns mmckerns added this to the dill-0.3.0 milestone Jun 26, 2019
@tvalentyn

tvalentyn commented Aug 26, 2019

@mmckerns Do we by chance have any details (perhaps in some other issue) that clarify why #56 is the root cause of this issue? Note that this issue manifests on Python 3 only, while #56 affects both Python 2 and Python 3 according to the discussion on #56.

Also, this issue is reproducible with dill==0.3.0, although it was added to the dill-0.3.0 milestone.

@tvalentyn

We can see that the disassembled pickle output refers to class A in the global namespace:

import dill
import pickletools

class A():
  def __init__(self,**args):
    super

a = A()
dump = dill.dumps(a)
del A   # comment this line for snippet to succeed.
pickletools.dis(dump, annotate=1)

dill.loads(dump) # fails.
...
  356: c                GLOBAL     'dill._dill _create_cell' Push a global object (module.attr) on the stack.
  381: q                BINPUT     31 Store the stack top into the memo.  The stack is not popped.
  383: c                GLOBAL     '__main__ A' Push a global object (module.attr) on the stack.
  395: q                BINPUT     32           Store the stack top into the memo.  The stack is not popped.
...

This looks very similar to the disassembled output of the standard pickler, obtained via pickletools.dis(pickle.dumps(a)):

    0: \x80 PROTO      3
    2: c    GLOBAL     '__main__ A'
   14: q    BINPUT     0
   16: )    EMPTY_TUPLE
   17: \x81 NEWOBJ
   18: q    BINPUT     1
   20: .    STOP

This may explain the error:

AttributeError: Can't get attribute 'A' on <module '__main__' from 'dill_issue_300.py'>

Unpickling succeeds if class A is defined in current main session.
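
The same by-reference behaviour can be reproduced with the standard pickler alone. The sketch below is an illustration, not dill's code path: `fake_main` is a hypothetical stand-in module used in place of __main__ so that the example behaves the same wherever it is run.

```python
import pickle
import sys
import types

# Hypothetical stand-in for __main__.
fake_main = types.ModuleType("fake_main")
sys.modules["fake_main"] = fake_main


class A:
    pass


# Make the class look as if it had been defined in fake_main.
A.__module__ = "fake_main"
A.__qualname__ = "A"
fake_main.A = A

payload = pickle.dumps(A())       # stores only a reference to fake_main.A
restored = pickle.loads(payload)  # works: the name still resolves

del fake_main.A                   # analogous to `del A` in the snippet above
try:
    pickle.loads(payload)
    error = None
except AttributeError as exc:
    error = exc
print(error)
```

Once the name is gone from the module, loading fails with an AttributeError, in the same way as the dill.loads call above.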

@tvalentyn

tvalentyn commented Aug 27, 2019

#75 is closely related. As a workaround for #75, dill pickles objects that invoke super by reference:

if _super: pickler._byref = True

This overrides the setting in dump_session that disables pickling by reference:

pickler._byref = False # disable pickling by name reference

As a result, restoring a main session fails when we encounter objects pickled by reference.

Note that on Python 2, for the examples mentioned here, pickling by reference is not triggered, because the expression

_super = ('super' in getattr(obj.func_code,'co_names',())) and (_byref is not None) and getattr(pickler, '_recurse', False)

evaluates to False for these examples.
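
The name-based clause of that expression can be tried in isolation. The quoted line uses the Python 2 attribute func_code; the sketch below uses __code__, the Python 3 equivalent. `mentions_super` is a hypothetical helper that mirrors only the 'super' in co_names clause and does not reproduce dill's _byref or _recurse handling.

```python
def mentions_super(func):
    """True if the function's code object lists `super` among its names."""
    code = getattr(func, "__code__", None)
    return "super" in getattr(code, "co_names", ())


class Base:
    def __init__(self):
        self.value = 1


class UsesSuper(Base):
    def __init__(self):
        super().__init__()


class UsesExplicitBase(Base):
    def __init__(self):
        Base.__init__(self)


print(mentions_super(UsesSuper.__init__))
print(mentions_super(UsesExplicitBase.__init__))
```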

@tvalentyn

I think we should reopen this issue as a separate issue, wdyt @mmckerns ?

@mmckerns
Member

Agreed.

@tvalentyn

@mmckerns, I've noticed that the milestone label on this issue keeps changing. Since it is currently tagged in the 0.3.3 milestone, and there is only one other issue in that milestone, does this mean that this issue is considered a high priority for the next milestone, or does the label serve some other purpose?
Regarding a potential fix: instead of pickling all objects that invoke super() by reference, do you think the 2-step approach that cloudpickle follows for pickling class instances (first pickle class instances without __dict__, then populate the __dict__) can be accomplished in dill as well?
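
The general two-step idea can be sketched in plain Python: create an empty class shell first, then fill in its attributes, pointing any __class__ closure cell at the new shell. This is only an illustration of the concept, not cloudpickle's or dill's implementation. `rebuild_two_step` is a hypothetical name, keyword-only defaults are not copied, and types.CellType requires Python 3.8 or later.

```python
import types


class Base:
    def __init__(self):
        self.base_ready = True


class Original(Base):
    def __init__(self):
        super().__init__()
        self.ready = True


def rebuild_two_step(cls):
    # Step 1: create an empty shell with the same name and bases.
    clone = type(cls.__name__, cls.__bases__, {})

    # Step 2: populate the shell, re-pointing __class__ cells at it.
    for name, value in vars(cls).items():
        if name in ("__dict__", "__weakref__"):
            continue  # per-class descriptors, recreated automatically
        if isinstance(value, types.FunctionType) and value.__closure__:
            cells = tuple(
                types.CellType(clone) if free == "__class__" else cell
                for free, cell in zip(value.__code__.co_freevars,
                                      value.__closure__)
            )
            value = types.FunctionType(
                value.__code__, value.__globals__, value.__name__,
                value.__defaults__, cells,
            )
        setattr(clone, name, value)
    return clone


Clone = rebuild_two_step(Original)
obj = Clone()
print(obj.base_ready, obj.ready)
```

Because the shell exists before its methods are attached, the cycle between a class and the closure cell of its own method never has to be resolved by name lookup.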

@mmckerns
Member

Yes, it's a high-priority item. I'm aware of the cloudpickle approach. This is on my to-do short list, but just keeps getting bumped. I worked on it a bit for this latest release, but didn't get a solution incorporated.

@tvalentyn

Good to know, thank you!

@mmckerns
Member

mmckerns commented Nov 2, 2020

@tvalentyn: I had to push this out another week, and past the new release. It's still on my priority list (and fingers-crossed will get resolved, as scheduled, this month).

@mmckerns mmckerns modified the milestones: dill-0.3.3, dill-0.3.4 Nov 2, 2020
@elmahyai

The problem is still not solved :(

@robb-brown

@ahmedelmahy If you're stuck, there's a pretty simple workaround:

#300 (comment)

@elmahyai

elmahyai commented Mar 14, 2021

Thanks @robb-brown
That worked for me, but only after I set the byref argument to False in the dump call:

with open('data.pkl', 'wb') as f:
    dill.dump(myobj, f, byref=False)

@anivegesana
Contributor

@mmckerns My PR (#443) seems to solve this issue for Python 3.0+.

@FruitfulApproach

Wtf, it's not unpickling properly over here either. Super-class never called no matter what I do. In fact init / setstate never freakin called!

@mmckerns
Member

@FruitfulApproach: this issue is resolved. So, if you are experiencing issues that you feel are related to this one, then please open a new issue -- you can reference this one -- but provide a minimal example that demonstrates what you are experiencing.
