Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: update message header type to match #39

Open
wants to merge 5 commits into
base: dev_early_access_development_branch
Choose a base branch
from

Conversation

mkaufmaner
Copy link

@mkaufmaner mkaufmaner commented May 6, 2024

Cleans up the existing message headers type.

The existing MessageHeader type is just a plain object and not an array of plain objects.
@mkaufmaner mkaufmaner requested a review from a team as a code owner May 6, 2024 11:23
Copy link

cla-assistant bot commented May 6, 2024

CLA assistant check
All committers have signed the CLA.

@milindl milindl force-pushed the dev_early_access_development_branch branch 4 times, most recently from 6dd05b2 to 9cd609b Compare May 7, 2024 05:36
@milindl
Copy link
Contributor

milindl commented May 7, 2024

Thanks for the PR.
The intent of this is to allow number or nulls to be passed onto the produce headers. Is that correct?

Two things:

  1. Could you update the kafkajs.d.ts, too, to allow numbers and nulls? The interface IHeaders will need to be updated.

  2. While reviewing this PR, I realized that anything except a string is not correctly supported, since we're converting the bytes passed from JavaScript to a utf8 string before sending.
    If you send, for instance, a buffer with 0x00 somewhere within it, the library truncates the buffer right there. Similarly, if you pass null, the library creates the header with the string value "null".
    I'll fix it in a bit. I will maintain the current behaviour for string and numbers (since people might be relying on the conversion of numbers to strings), and change it to allow Buffers and nulls properly (which are somewhat unusable in the current state).
    There's another issue where if number of keys of MessageHeader is more than 1, only 1 key is used.
    I don't think this blocks the PR, I'm just posting this so you can keep it in mind while using it.

@milindl milindl self-requested a review May 7, 2024 09:38
@mkaufmaner
Copy link
Author

mkaufmaner commented May 7, 2024

Thanks for the PR. The intent of this is to allow number or nulls to be passed onto the produce headers. Is that correct?

@milindl Partially correct. Yes to add support for nulls and numbers as well as cleaning up the typings.

  1. Could you update the kafkajs.d.ts, too, to allow numbers and nulls? The interface IHeaders will need to be updated.

@milindl Done.

  1. While reviewing this PR, I realized that anything except a string is not correctly supported, since we're converting the bytes passed from JavaScript to a utf8 string before sending.
    If you send, for instance, a buffer with 0x00 somewhere within it, the library truncates the buffer right there. Similarly, if you pass null, the library creates the header with the string value "null".

I can confirm and I have replicated this rather unexpected behavior.

I'll fix it in a bit. I will maintain the current behaviour for string and numbers (since people might be relying on the conversion of numbers to strings), and change it to allow Buffers and nulls properly (which are somewhat unusable in the current state).

Thank you!

If I may make a recommendation, since this project is pre-GA, I would make all the breaking changes necessary to align with what would be considered expected behavior. In my opinion, the conversion of a primitive, like a number to a string, would not be considered expected behavior.

There's another issue where if number of keys of MessageHeader is more than 1, only 1 key is used.

I feel as if there is a larger discussion that needs to be had around how headers are implemented. I would honestly like to see headers passed as a plain object or map instead of an array of plain objects with a single property. I am recommending this because of both syntax and performance. The performance of implementations performing lookup operations on an array of headers have a time complexity of O(N). However, if the headers were an object then the
complexity can be reduced to O(1).

Preferably I would like to see;

export type MessageHeaderValue = Buffer | string | number | null;
export type MessageHeader = Record<string, MessageHeaderValue>;
export type MessageHeaders = MessageHeader | null | undefined;

@milindl
Copy link
Contributor

milindl commented May 9, 2024

Thanks for the PR update. I've been thinking about the conversion thing.

The Kafka protocol expects header values to be []bytes. Breaking a number into bytes might pose a slight issue for endianness, and since type information isn't passed within the headers, the receiver will only ever get bytes and will need to write code outside of the library to change it back to a number.

Maybe we should remove the number type entirely, and leave the encoding of number -> []byte to the user too. Since we can make API changes, we should probably make use of this opportunity for number. Ideally, only the Buffer type should be permitted as the header value, but I think it's very reasonable that a string can be turned into a utf8 []bytes by the library.

As for making the header an object, the keys are allowed to be repeated by the Kafka protocol. The type used by the kafkajs.d.ts is probably a good compromise, allowing key->[value] in the object that is passed.

@mkaufmaner
Copy link
Author

@milindl

The Kafka protocol expects header values to be []bytes. Breaking a number into bytes might pose a slight issue for endianness, and since type information isn't passed within the headers, the receiver will only ever get bytes and will need to write code outside of the library to change it back to a number.

Maybe we should remove the number type entirely, and leave the encoding of number -> []byte to the user too. Since we can make API changes, we should probably make use of this opportunity for number. Ideally, only the Buffer type should be permitted as the header value, but I think it's very reasonable that a string can be turned into a utf8 []bytes by the library.

As for the the number type, I think removing the number type would be appropriate and leaving the string type in place would be for the best. I also agree, that an ideal solution would be header values are only buffers. However, I also believe that only allowing buffers would be detrimental to developer experience.

As for making the header an object, the keys are allowed to be repeated by the Kafka protocol. The type used by the kafkajs.d.ts is probably a good compromise, allowing key->[value] in the object that is passed.

That would be a great compromise and the functionality to support both shouldn't be too complicated to implement.

Copy link
Contributor

@milindl milindl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be a great compromise and the functionality to support both shouldn't be too complicated to implement.

Regarding this, I'll create a separate issue after we merge this in. I think this is pretty reasonable as we would not need to break any existing functionality and it's just more reasonable.

I've made changes to the development branch also which fix a bunch of issues with the header conversions and adds validation so only strings and buffers can be sent.

@@ -118,7 +118,7 @@ export interface ProducerConstructorConfig extends ProducerGlobalConfig {
}

export interface IHeaders {
[key: string]: Buffer | string | (Buffer | string)[] | undefined
[key: string]: Buffer | string | number | null | (Buffer | string | number)[] | undefined
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's remove the number type as we discussed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made the change within the C++ side, so now an error is thrown for numeric header values

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants