Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch to module-based singleton for ParslSerializer #2467

Merged
merged 2 commits into from
Nov 22, 2022

Conversation

benclifford
Copy link
Collaborator

@benclifford benclifford commented Nov 15, 2022

This fixes some problems where the class based serializer object was sometimes serialized and sent over the network; there is no reasonable behaviour for deserializing a singleton, but in the previous implementation, the caches used for serialization were overwritten.

The ParslSerializer class is removed entirely and the parsl.serialize.facade is the singleton instead. This is a more pythonic way of representing singletons.

Use git show -w on this commit to see that most of the usual diff is only whitespace indentation changes.

To identify places where a ParslSerializer object was deserialized, I added a __setstate__ method which is called whenever a ParslSerializer is deserialized.

This fired when using monitoring, as monitoring carries round with it quite a lot of serialization overhead.

@@ -171,3 +171,6 @@ class ParslSerializer(object):
         assert len(unpacked) == 3, "Unpack expects 3 buffers, got {}".format(len(unpacked))

         return unpacked
+
+    def __setstate__(self, state):
+        raise ValueError("BENC: serializer setstate called")

Type of change

  • Bug fix (non-breaking change that fixes an issue)

This fixes some problems where the class based serializer object was sometimes
serialized and sent over the network; there is no reasonable behaviour for
deserializing a singleton, but in the previous implementation, the caches used
for serialization were overwritten.

The ParslSerializer class is removed entirely and the parsl.serialize.facade
is the singleton instead. This is a more pythonic way of representing
singletons.

Use `git show -w` on this commit to see that most of the usual
diff is only whitespace indentation changes.

To identify places where a ParslSerializer object was deserialized, I added a
__setstate__ method which is called whenever a ParslSerializer is deserialized.

This fired when using monitoring, as monitoring carries round with it quite a
lot of serialization overhead.

@@ -171,3 +171,6 @@ class ParslSerializer(object):
         assert len(unpacked) == 3, "Unpack expects 3 buffers, got {}".format(len(unpacked))

         return unpacked
+
+    def __setstate__(self, state):
+        raise ValueError("BENC: serializer setstate called")
@benclifford benclifford merged commit 56491bc into master Nov 22, 2022
@benclifford benclifford deleted the benc-serializer-singleton branch November 22, 2022 14:02
benclifford added a commit that referenced this pull request May 19, 2023
This inner loop populates a table of data serializers, which should happen
at the same level as populating the table of code serializers, rather than
as an inner loop of code serializer population.

This inner loop is idempotent, so the behaviour should mostly be unchanged
by this PR, but in the obscure situation that someone modifies serialisation
so that it has no code serializers, only data serializers, then before this
PR, I think those data serializers would not be visible to users, as the
inner loop would never be executed.

This was introduced by PR #2467 which moved away from using a class singleton
to using the module as the singleton.
benclifford added a commit that referenced this pull request May 23, 2023
This inner loop populates a table of data serializers, which should happen
at the same level as populating the table of code serializers, rather than
as an inner loop of code serializer population.

This inner loop is idempotent, so the behaviour should mostly be unchanged
by this PR, but in the obscure situation that someone modifies serialisation
so that it has no code serializers, only data serializers, then before this
PR, I think those data serializers would not be visible to users, as the
inner loop would never be executed.

This was introduced by PR #2467 which moved away from using a class singleton
to using the module as the singleton.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant