Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fix][client] Fix deadlock when sending chunked messages with BlockIFQueueFull enabled #5

Closed
wants to merge 5 commits into from

Conversation

RobertIndie
Copy link
Owner

@RobertIndie RobertIndie commented Sep 22, 2022

Fixes #

Master Issue: #

Motivation

When the isBlockIfQueueFull is enabled for the producer, if the producer sends a very large message(# of total chunks > maxPendingMessages) or the producer sends too many large messages concurrently, there will be deadlock when the producer tries too acquire the permit before publishing these messages.

The root cause is that the producer will try to acquire all permits for all chunks before publishing them. It will be blocked if there are not enough permits. For example, if the payload size of the message is 55 MB which will be split into 11 chunks and the maxPendingMessages size is 10, it will be blocked forever. Even if the message is not too large(like 30MB) but sends 5 such messages concurrently, it may also be blocked forever.

Modifications

  • If the BlockIfQueueFull is enabled, the producer will only acquire a single permit for each chunk before publishing them.

This PR only affects the case of BlockIfQueueFull enabled when sending chunked messages, and it will not affect other existing behaviors.

Verifying this change

  • Make sure that the change passes the CI checks.

(Please pick either of the following options)

This change is a trivial rework / code cleanup without any test coverage.

(or)

This change is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(example:)

  • Added integration tests for end-to-end deployment with large payloads (10MB)
  • Extended integration test for recovery after broker failure

Does this pull request potentially affect one of the following parts:

If the box was checked, please highlight the changes

  • Dependencies (add or upgrade a dependency)
  • The public API
  • The schema
  • The default values of configurations
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • Anything that affects deployment

Documentation

  • doc-required
    (Your PR needs to update docs and you will update later)

  • doc-not-needed
    (Please explain why)

  • doc
    (Your PR contains doc changes)

  • doc-complete
    (Docs have been already added)

Matching PR in forked repository

PR in forked repository: <!-- ENTER URL HERE

After opening this PR, the build in apache/pulsar will fail and instructions will
be provided for opening a PR in the PR author's forked repository.

apache/pulsar pull requests should be first tested in your own fork since the
apache/pulsar CI based on GitHub Actions has constrained resources and quota.
GitHub Actions provides separate quota for pull requests that are executed in
a forked repository.

The tests will be run in the forked repository until all PR review comments have
been handled, the tests pass and the PR is approved by a reviewer.

-->

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant