move parser down to connector to avoid duplicate serialization overhead #4961

tabVersion · 2022-08-29T07:57:52Z

Background

When using a connector that generates data internally, e.g. datagen and nexmark, the existing implementation needs to serialize the generated data to JSON and then deserialize it into a DataChunk in the upper layer, which causes unnecessary overhead.

Design

In the previous design for datagen, the connector is already informed of the columns and schema, so there is no difficulty in generating a DataChunk directly inside the connector.
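As a rough illustration of the idea (not the actual implementation), the sketch below builds a columnar chunk straight from a schema, without producing or parsing any JSON text. The names `DataType`, `Field`, `Column`, `Chunk`, and `generate_chunk` are hypothetical, simplified stand-ins invented for this example; they are not the project's real types or APIs.

```rust
// Hypothetical, simplified stand-ins for the real schema and chunk types.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum DataType {
    Int64,
    Varchar,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Field {
    pub name: String,
    pub data_type: DataType,
}

// One typed vector per column, instead of one JSON object per row.
#[derive(Debug, PartialEq)]
pub enum Column {
    Int64(Vec<i64>),
    Varchar(Vec<String>),
}

#[derive(Debug, PartialEq)]
pub struct Chunk {
    pub columns: Vec<Column>,
    pub cardinality: usize,
}

/// Builds one chunk of `rows` rows directly from the schema.
/// Values are a deterministic sequence starting at `start`, which is an
/// assumption of this sketch; a real generator may use random or
/// user-configured values.
pub fn generate_chunk(schema: &[Field], start: i64, rows: usize) -> Chunk {
    let columns = schema
        .iter()
        .map(|field| match field.data_type {
            DataType::Int64 => {
                Column::Int64((0..rows as i64).map(|i| start + i).collect())
            }
            DataType::Varchar => Column::Varchar(
                (0..rows as i64)
                    .map(|i| format!("{}_{}", field.name, start + i))
                    .collect(),
            ),
        })
        .collect();
    Chunk { columns, cardinality: rows }
}
```

Because the connector already knows each column's type, every value is written once into its typed column, so the serialize-then-parse round trip described in the Background section is skipped.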

Future Optimizations

No response

Discussions

No response

Q&A

No response

lmatz · 2022-11-23T09:13:37Z

Needed by the stream executor's micro-benchmarks (executed in e2e style), i.e. to prevent the datagen source from becoming a bottleneck.

tabVersion added the type/feature label Aug 29, 2022

tabVersion self-assigned this Aug 29, 2022

fuyufjh added this to the release-0.1.13 milestone Aug 31, 2022

lmatz added performance type/perf and removed performance labels Sep 5, 2022

lmatz mentioned this issue Sep 6, 2022

overhead from json serde #5122

Closed

tabVersion assigned TennyZhuang and unassigned tabVersion Sep 7, 2022

fuyufjh modified the milestones: release-0.1.13, next-release-0.1.14 Sep 26, 2022

fuyufjh modified the milestones: release-0.1.14, release-0.1.15 Nov 23, 2022

fuyufjh modified the milestones: release-0.1.15, release-0.1.16 Dec 19, 2022

lmatz mentioned this issue Dec 20, 2022

perf(connector): generate native Row when datagen to avoid json deser #6969

Closed

3 tasks

tabVersion mentioned this issue Dec 23, 2022

Tracking: migrating parser to byte stream based trait #7032

Closed

6 tasks

TennyZhuang modified the milestones: release-0.1.16, release-0.1.17 Jan 30, 2023

TennyZhuang closed this as completed Feb 19, 2023
