move parser down to connector to avoid duplicate serialization overhead #4961

Closed
tabVersion opened this issue Aug 29, 2022 · 1 comment

Comments

@tabVersion
Contributor

Background

When using a connector that generates data internally, e.g. datagen and nexmark, the existing implementation needs to serialize the generated data to JSON and then deserialize it into a DataChunk in the upper layer, which causes unnecessary overhead.

Design

In the previous design for datagen, connectors are already informed of the columns and schema, so there is no difficulty in generating a DataChunk directly inside the connector.
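To make the idea concrete, here is a minimal sketch in Python comparing the two paths. It is an illustration only, not the project's code: `Column`, `DataChunk`, `gen_value`, `generate_via_json`, and `generate_direct` are hypothetical names, and the real `DataChunk` is assumed to be a columnar batch that is considerably richer than the dictionary of lists used here.

```python
import json
import random
from dataclasses import dataclass


# Hypothetical stand-ins for illustration; not the project's actual types.
@dataclass
class Column:
    name: str
    kind: str  # "int" or "str"


@dataclass
class DataChunk:
    # Columnar layout: one list of values per column name.
    columns: dict


def gen_value(col, rng):
    """Generate one random value matching the column's declared kind."""
    if col.kind == "int":
        return rng.randint(0, 99)
    return "v" + str(rng.randint(0, 99))


def generate_via_json(schema, n, seed):
    """Existing path: the connector emits JSON, the upper layer parses it back."""
    rng = random.Random(seed)
    # Connector side: each generated row is serialized to a JSON string.
    payloads = [
        json.dumps({c.name: gen_value(c, rng) for c in schema}) for _ in range(n)
    ]
    # Upper layer: each payload is deserialized and appended column by column.
    chunk = DataChunk({c.name: [] for c in schema})
    for payload in payloads:
        row = json.loads(payload)
        for c in schema:
            chunk.columns[c.name].append(row[c.name])
    return chunk


def generate_direct(schema, n, seed):
    """Proposed path: the connector knows the schema and fills the chunk itself."""
    rng = random.Random(seed)
    chunk = DataChunk({c.name: [] for c in schema})
    for _ in range(n):
        for c in schema:
            chunk.columns[c.name].append(gen_value(c, rng))
    return chunk
```

With the same schema and seed, both functions return equal chunks; the direct path simply skips the `json.dumps` / `json.loads` round trip for every row.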

Future Optimizations

No response

Discussions

No response

Q&A

No response

@lmatz
Contributor

lmatz commented Nov 23, 2022

Needed by the stream executor's micro-benchmarks (executed in e2e style), i.e. to prevent the datagen source from becoming a bottleneck.
