feat(pubsub): ordering keys #26

pradn · 2020-02-04T21:10:41Z

Fixes #6

plamut

As discussed on the original PR, a system test would be very good to have, although that can be added separately.

There is a possibility that the code temporarily deviates from the FlowControl.max_bytes limit in certain usage scenarios - which are believed to be atypical - but I'd like to hear from @kamalaboulhosn for the final ruling on that.

The PR still needs changes from the latest master, GitHub reports that it's out of date and not mergeable yet.

kamalaboulhosn

Looks like pubsub/google/cloud/pubsub_v1/subscriber/_protocol/messages_on_hold.py didn't get pulled over.


kamalaboulhosn · 2020-02-05T17:01:32Z

> As discussed on the original PR, a system test would be very good to have, although that can be added separately.
>
> There is a possibility that the code temporarily deviates from the FlowControl.max_bytes limit in certain usage scenarios - which are believed to be atypical - but I'd like to hear from @kamalaboulhosn for the final ruling on that.
>
> The PR still needs changes from the latest master, GitHub reports that it's out of date and not mergeable yet.

@plamut I'm actually okay with the deviation in this way. Here is the way I think about it: max-bytes flow control is there to save the client from OOMing when message sizes can vary and the size of the message is overwhelmingly what takes up memory. Given that the message is already buffered in memory in the client library, no more memory is used by sending it to the user callback.

If the interesting factor in memory utilization is state that has to be loaded because of the message, e.g., having to load a large record from a database or a file from GCS, then the number of messages is the better thing to use for flow control. For that flow control, I believe we won't deliver more messages unless we can.
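
As an illustration of the two limits discussed above, here is a minimal sketch, not the library's implementation. The class `FlowControlSketch` and its methods are hypothetical; only the limit names mirror `FlowControl.max_bytes` and `FlowControl.max_messages`.

```python
from dataclasses import dataclass


@dataclass
class FlowControlSketch:
    """Toy model of subscriber-side flow control (hypothetical, not library code)."""

    max_messages: int
    max_bytes: int
    in_flight: int = 0   # messages currently handed to user callbacks
    load_bytes: int = 0  # total payload size of those messages

    def can_release(self, size: int) -> bool:
        """Return True if one more message of `size` bytes fits both limits."""
        return (
            self.in_flight + 1 <= self.max_messages
            and self.load_bytes + size <= self.max_bytes
        )

    def release(self, size: int) -> None:
        """Record that a message of `size` bytes went to a callback."""
        self.in_flight += 1
        self.load_bytes += size

    def done(self, size: int) -> None:
        """Record that the callback for a message of `size` bytes finished."""
        self.in_flight -= 1
        self.load_bytes -= size
```

With `max_messages=2`, a third small message is held back even when the byte budget is nearly empty, which is the behavior you want when each message pulls in expensive external state.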

pradn · 2020-02-05T17:07:09Z

@kamalaboulhosn I hadn't thought of that max-bytes limit as primarily an OOM-protection feature. I'll keep that in mind.

I actually just now tried the code changes that would need to be made to prevent this overflow. It's not much of a change, but it adds more ifs-ands-and-buts to the design, making it harder to understand. I.e., the invariant that keys in the "pending ordering keys" dict all have messages in flight is diluted: now there's an extra possibility that the key is waiting to be "activated".

In any case, I prefer to keep things simple for now, because it's only gonna get more complicated when we do an optimization pass and fix bugs.
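
The invariant mentioned above can be sketched as follows. This is a toy model under stated assumptions, not the code in `messages_on_hold.py`; the class `OnHoldSketch` and its method names are hypothetical.

```python
from collections import deque


class OnHoldSketch:
    """Toy model of per-key ordering (hypothetical, not the library's code)."""

    def __init__(self):
        # Invariant: a key is in this dict only while one of its messages
        # is in flight. The deque holds later messages waiting behind it.
        self._pending = {}

    def offer(self, key, message):
        """Return the message if it may be delivered now, else hold it."""
        if key in self._pending:
            self._pending[key].append(message)
            return None
        self._pending[key] = deque()
        return message

    def ack(self, key):
        """The in-flight message for `key` is done; return the next one, if any."""
        queue = self._pending[key]
        if queue:
            return queue.popleft()
        # Nothing left in flight for this key, so drop it to keep the invariant.
        del self._pending[key]
        return None
```

Enforcing the byte limit strictly would add a third state, a key that is present in the dict with nothing in flight, which is the extra case the comment above prefers to avoid.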

pradn · 2020-02-05T18:37:59Z

Going to release a version of the library before merging this branch.

camerondavison · 2020-03-19T18:08:19Z

We have been seeing this new assert trigger in production: https://github.com/googleapis/python-pubsub/pull/26/files#diff-a9838428343bb40f2ac9b6cf2beb9f77R332-R335 I feel that if there is an error, it should try to create a new batch rather than hit that assert. Given the rest of the code, if I am not mistaken, the assert should possibly be moved down below

if not self.will_accept(message):
    return future

so that it will return None and create a new batch.
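
A minimal sketch of the suggested behavior, assuming `will_accept` already rejects messages whenever the batch is not accepting. `BatchSketch`, `Status`, and `publish_with_retry` are hypothetical names for illustration and are not the library's classes.

```python
from concurrent.futures import Future
from enum import Enum


class Status(Enum):
    ACCEPTING = "accepting"
    ERROR = "error"


class BatchSketch:
    """Toy batch (hypothetical): declines messages instead of asserting."""

    def __init__(self, max_messages=2):
        self.status = Status.ACCEPTING
        self.messages = []
        self.max_messages = max_messages

    def will_accept(self, message):
        # Assumption: a batch that is full or in ERROR accepts nothing.
        return (
            self.status is Status.ACCEPTING
            and len(self.messages) < self.max_messages
        )

    def publish(self, message):
        # No status assert before this check: an errored batch returns None.
        if not self.will_accept(message):
            return None
        self.messages.append(message)
        return Future()


def publish_with_retry(batch, message):
    """Publish to `batch`; if it declines, start a fresh batch and retry once."""
    future = batch.publish(message)
    if future is None:
        batch = BatchSketch()
        future = batch.publish(message)
    return batch, future
```

In this sketch a caller that holds an errored batch gets a new batch and a future back, rather than an `AssertionError`.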

googlebot added the "cla: yes" label Feb 4, 2020

feat(pubsub): ordering keys

c830b62

pradn force-pushed the ordering_keys branch from eda13d3 to c830b62 February 4, 2020 21:11

Add missing files.

52f06f0

plamut reviewed Feb 5, 2020


kamalaboulhosn reviewed Feb 5, 2020


pradn added 3 commits February 5, 2020 10:57

Add more missing files.

b8f2b0e

Merge branch 'master' into ordering_keys

b8481dd

Preserve backwards compatibility of publisher client parameters.

d688a81

kamalaboulhosn approved these changes Feb 5, 2020


pradn added the "do not merge" label Feb 5, 2020

Merge branch 'master' into ordering_keys

9dcf6e9

pradn merged commit cc3093a into googleapis:master Feb 5, 2020

pradn removed the "do not merge" label Feb 5, 2020

pradn deleted the ordering_keys branch February 5, 2020 22:07

This was referenced Feb 10, 2020

feat(pubsub): subscriber-side changes for ordering keys googleapis/google-cloud-python#10201

Closed

feat(pubsub): publish-side changes for ordering keys googleapis/google-cloud-python#9929

Closed

yoshi-automation mentioned this pull request Feb 19, 2020

[CHANGE ME] Re-generated to pick up changes in the API or client library generator. #33

Closed

camerondavison mentioned this pull request Mar 19, 2020

asserting throwing for async publish with currently errored batch #48

Closed

release-please bot mentioned this pull request Aug 11, 2022

chore(main): release 2.13.6 #761

Merged
