Implement compression and decompression for Huffman coding #39658
This PR introduces two new methods, `compress_encoded` and `decompress_encoded`, to make Huffman-encoded output more compact. The problem is that encoded binary strings are often too long to store or transmit efficiently.

The `compress_encoded` method pads an encoded binary string to a multiple of 8 bits and then converts every 8-bit chunk into a single character. The `decompress_encoded` method reverses this process by converting each character back into its 8-bit binary representation and removing the added padding.

These changes improve storage efficiency while keeping reconstruction of the original data lossless. Doctests check that a round trip through encoding, compression, decompression, and decoding preserves the original input.
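The pad-and-pack idea described above can be sketched as follows. This is a minimal illustration, not the code in this PR: the description does not say how the padding length is recorded, so the sketch assumes an 8-bit header that stores the number of padding bits, and it writes the two methods as standalone functions.

```python
def compress_encoded(bits: str) -> str:
    """Pack a string of '0'/'1' characters into one character per 8 bits."""
    # Number of zero bits needed to reach a multiple of 8 (0 if already aligned).
    padding = (8 - len(bits) % 8) % 8
    # Assumption (not confirmed by the PR): an 8-bit header records the padding
    # length so that decompression knows how many trailing bits to drop.
    padded = f"{padding:08b}" + bits + "0" * padding
    # Convert each 8-bit chunk into the character with that code point (0-255).
    return "".join(
        chr(int(padded[i : i + 8], 2)) for i in range(0, len(padded), 8)
    )


def decompress_encoded(packed: str) -> str:
    """Reverse compress_encoded: expand characters to bits and strip padding."""
    # Expand every character back into its 8-bit binary representation.
    bits = "".join(f"{ord(ch):08b}" for ch in packed)
    # The first 8 bits are the assumed header holding the padding length.
    padding = int(bits[:8], 2)
    payload = bits[8:]
    # Drop the trailing padding bits; a padding of 0 keeps the whole payload.
    return payload[: len(payload) - padding]


if __name__ == "__main__":
    original = "1011001110001"
    packed = compress_encoded(original)
    assert decompress_encoded(packed) == original
    print(len(original), "bit characters packed into", len(packed), "characters")
```

Every character produced here has a code point in the range 0 to 255, so `packed.encode("latin-1")` yields exactly one byte per character. Under other encodings such as UTF-8, code points above 127 take more than one byte, which reduces the saving.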
📝 Checklist