
KAFKA-7632: Allow fine-grained configuration for compression #5927

Closed
wants to merge 19 commits

Conversation

dongjinleekr
Contributor

This PR implements KIP-390: Add producer option to adjust compression level.
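For context, a minimal sketch of what a producer using this option might look like. The `compression.level` property name follows the Type A interface discussed later in this thread; it is illustrative and was not finalized in this PR:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class CompressionLevelExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // Existing option: which codec to use.
        props.put("compression.type", "zstd");
        // Proposed option from this KIP (name still under discussion at the time).
        props.put("compression.level", "3");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("my-topic", "key", "value"));
        }
    }
}
```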

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@dongjinleekr
Contributor Author

Please note that this is a draft implementation. It will be updated once @luben updates zstd-jni (see luben/zstd-jni#80).

@luben

luben commented Nov 19, 2018

I just shipped zstd-jni-1.3.7-2, which adds methods to query the min/max supported compression levels.
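For reference, a minimal sketch of validating a requested level with those new query methods (the `Zstd` class and the min/max methods are the zstd-jni API; the surrounding validation helper is hypothetical):

```java
import com.github.luben.zstd.Zstd;

public final class ZstdLevelCheck {
    // Hypothetical helper: reject a user-supplied zstd level outside the
    // range reported by the query methods added in zstd-jni 1.3.7-2.
    static int validateLevel(int requested) {
        int min = Zstd.minCompressionLevel();
        int max = Zstd.maxCompressionLevel();
        if (requested < min || requested > max) {
            throw new IllegalArgumentException(
                "zstd level " + requested + " outside supported range [" + min + ", " + max + "]");
        }
        return requested;
    }
}
```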

@dongjinleekr
Contributor Author

I just applied zstd-jni-1.3.7-2. Thanks again @luben!!

@dongjinleekr
Contributor Author

And there is another issue here: as of now, Kafka's checkstyle rules disallow more than 13 parameters for any method. However, with the introduction of a new configuration entity, the constructor of MemoryRecordsBuilder (lines 86, 160) requires 14 parameters, which is why the automated check fails. To complete this PR, this problem also needs to be solved.
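A common way around such a parameter-count limit, and roughly what a later commit in this PR does, is to replace the long constructor with a fluent builder. A hedged sketch; the class and field names here are illustrative, not the actual MemoryRecordsBuilder API:

```java
// Illustrative builder that bundles many constructor parameters into
// chained setters, keeping every method under the checkstyle limit.
public final class RecordsBuilderSketch {
    private byte magic = 2;                // record batch format version
    private String compressionType = "none";
    private int compressionLevel = -1;     // -1 = codec default
    private long baseOffset = 0L;
    private int writeLimit = Integer.MAX_VALUE;

    public RecordsBuilderSketch magic(byte magic) { this.magic = magic; return this; }
    public RecordsBuilderSketch compressionType(String type) { this.compressionType = type; return this; }
    public RecordsBuilderSketch compressionLevel(int level) { this.compressionLevel = level; return this; }
    public RecordsBuilderSketch baseOffset(long offset) { this.baseOffset = offset; return this; }
    public RecordsBuilderSketch writeLimit(int limit) { this.writeLimit = limit; return this; }

    public String build() {
        // Real code would construct a MemoryRecordsBuilder here; this
        // sketch only shows the shape of the fluent call chain.
        return compressionType + "@" + compressionLevel;
    }
}
```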

@dongjinleekr dongjinleekr force-pushed the feature/KAFKA-7632 branch 2 times, most recently from b8e939c to 2d3b64a on December 28, 2018 14:57
@dongjinleekr dongjinleekr changed the title KAFKA-7632: Add producer option to adjust compression level KAFKA-7632: Allow fine-grained configuration for compression Jan 6, 2019
@dongjinleekr dongjinleekr force-pushed the feature/KAFKA-7632 branch 2 times, most recently from a7be368 to 8427a03 on January 22, 2019 16:03
…ampType, long, long, long, short, int, boolean, boolean, int, int): test-only method.
…tampType, long, long, long, short, int, boolean, int)
…tampType, long, long, long, short, int, boolean, boolean, int)
…ds#builder, instead of instantiating MemoryRecordsBuilder directly
…ead of instantiating MemoryRecordsBuilder directly
…okerCompressionCodec#getTargetCompressionCodec(String, CompressionCodec)
…Codec#getTargetCompressionType. (returns CompressionType)

1. LogAppendInfo and LogValidator now use CompressionType instead of CompressionCodec.
2. All related tests (BrokerCompressionTest, LogValidatorTest, AbstractFetcherThreadTest) are updated accordingly.
…pression level, block size configuration for lz4.

1. Make KafkaLZ4BlockOutputStream#BLOCKSIZE_[64KB,256KB,1MB,4MB] private.
2. Add KafkaLZ4BlockOutputStream#AVAILABLE_BLOCKSIZES.
3. Remove KafkaLZ4BlockOutputStream(OutputStream, int, boolean), KafkaLZ4BlockOutputStream(OutputStream, int), KafkaLZ4BlockOutputStream(OutputStream): not used.
4. Remove blockChecksum from KafkaLZ4BlockOutputStream(OutputStream, int, boolean, boolean): it was always false. The FLG instance is now created with FLG(), which is equivalent to FLG(false). To support block size and compression level, the constructor now receives an LZ4Compressor and a BD instance as parameters. KafkaLZ4BlockOutputStream#of now creates the LZ4Compressor and BD instances (together with the useBrokenFlagDescriptorChecksum parameter) according to the compression level, block size, and magic. CompressionType.LZ4#wrapForOutput now calls KafkaLZ4BlockOutputStream#of instead of KafkaLZ4BlockOutputStream(OutputStream, boolean), which is also removed since it is no longer used.
5. KafkaLZ4Test: add magic, compressionLevel, blockSize as testing parameters. useBrokenFlagDescriptorChecksum is now determined by magic. blockChecksum is removed.
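As an aside, a rough illustration of how a compression level can select an LZ4 compressor, using the lz4-java API that Kafka's LZ4 stream is built on (the level-dispatch helper itself is hypothetical, not the code in these commits):

```java
import net.jpountz.lz4.LZ4Compressor;
import net.jpountz.lz4.LZ4Factory;

final class Lz4CompressorSelector {
    // Hypothetical helper: pick a compressor for the requested level.
    // Level <= 0 falls back to the fast (default) compressor; anything
    // higher selects the high-compression variant at that level.
    static LZ4Compressor forLevel(int level) {
        LZ4Factory factory = LZ4Factory.fastestInstance();
        return level <= 0 ? factory.fastCompressor() : factory.highCompressor(level);
    }
}
```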
1. Add CompressionConfig: without this class, the constructor of MemoryRecordsBuilder violates the parameter count limit (it already has 13 parameters).
2. MemoryRecordsBuilder now uses CompressionConfig instead of CompressionType.
3. Add level, blockSize parameters to CompressionType#wrapForOutput.
4. Add compressionLevel, compressionBufferSize, compressionConfig to MemoryRecords.Builder.
1. Make LogValidator#validateMessagesAndAssignOffsets use CompressionType and CompressionConfig instead of CompressionCodec.
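To make the commit messages above more concrete, a hedged sketch of what a parameter object like CompressionConfig could look like (field and method names are illustrative; see the actual commits for the real shape):

```java
// Illustrative value object bundling the compression settings that
// would otherwise be passed as separate constructor parameters.
public final class CompressionConfig {
    private final String type;       // e.g. "gzip", "lz4", "zstd"
    private final int level;         // codec-specific level, -1 = codec default
    private final int bufferSize;    // block/buffer size in bytes, -1 = default

    public CompressionConfig(String type, int level, int bufferSize) {
        this.type = type;
        this.level = level;
        this.bufferSize = bufferSize;
    }

    public String type() { return type; }
    public int level() { return level; }
    public int bufferSize() { return bufferSize; }
}
```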
@dongjinleekr
Contributor Author

Here is the re-implementation of KIP-390, rebased against the latest trunk. The commits currently consist of the following:

  1. Refactor the existing code.
    1. Replace the current MemoryRecords#builder implementation with a new one based on fluent method calls. This enables MemoryRecords#builder to support more than 13 parameters and also makes rebasing against the latest trunk easier.
    2. Change BrokerCompressionCodec#getTargetCompressionCodec to return CompressionType instead of CompressionCodec. This enables the introduction of a new class, CompressionConfig. (A new test method for validation was also added.)
  2. Extend the compression functionality to support compression level and buffer size.
    1. Refactor KafkaLZ4BlockOutputStream and KafkaLZ4Test, and add compression level and block size configuration support for LZ4.
    2. Add GZipOutputStream to support compression level and block size for gzip. (A sketch of this approach follows this list.)
    3. Add CompressionConfig which encapsulates compression type, level, and buffer size, and make Producer and Broker use it.
  3. Implement two alternative public interfaces that are under discussion:
    • Type A: Adds compression.level and compression.buffer.size.
    • Type B: Adds compression.config (like listener.security.protocol.map or max.connections.per.ip.overrides).
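For item 2.2, the usual trick for exposing a gzip level is a thin subclass of the JDK's GZIPOutputStream that adjusts its internal Deflater. A minimal sketch, assuming this approach; the class name matches the commit, but the exact constructor signature in the PR may differ:

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.GZIPOutputStream;

// java.util.zip.GZIPOutputStream hard-codes the default deflater level,
// but exposes its Deflater as the protected field `def`, so a subclass
// can set a custom level (1..9) and buffer size.
public class GZipOutputStream extends GZIPOutputStream {
    public GZipOutputStream(OutputStream out, int bufferSize, int level) throws IOException {
        super(out, bufferSize);
        def.setLevel(level);
    }
}
```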

Since the public interface is not settled yet, this PR currently includes only the common commits through step 2. For the final ones, please refer to the following branches:

@dongjinleekr
Contributor Author

dongjinleekr commented Jun 5, 2021

Closing this PR in favor of a new one.
