Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support tracking updated aggregated states #465

Conversation

yl-lisen
Copy link
Collaborator

@yl-lisen yl-lisen commented Jan 2, 2024

This close #446

Please write user-readable short description of the changes:

  • Add updates tracking data (tracking changes/ updates .etc)
  • Unified versioned serialize and deserialize of aggregator
  • Remove some useless implementation in Aggregator

@yl-lisen yl-lisen self-assigned this Jan 2, 2024
@yl-lisen yl-lisen force-pushed the feature/issue-446-support-tracking-updated-aggr-windows-or-groups branch from 44296ee to ddb9432 Compare January 7, 2024 16:50
@yl-lisen yl-lisen force-pushed the feature/issue-446-support-tracking-updated-aggr-windows-or-groups branch from 8f614ef to d873f04 Compare January 22, 2024 15:26
@yl-lisen yl-lisen force-pushed the feature/issue-446-support-tracking-updated-aggr-windows-or-groups branch from d873f04 to b2fcb97 Compare January 30, 2024 06:45
@yl-lisen yl-lisen marked this pull request as ready for review January 30, 2024 06:45
@yl-lisen yl-lisen force-pushed the feature/issue-446-support-tracking-updated-aggr-windows-or-groups branch from b2fcb97 to b6faf7a Compare January 30, 2024 07:13
@yl-lisen yl-lisen requested a review from chenziliang January 30, 2024 07:14
@yl-lisen yl-lisen force-pushed the feature/issue-446-support-tracking-updated-aggr-windows-or-groups branch from b6faf7a to 9c361a3 Compare January 30, 2024 07:36
@chenziliang
Copy link
Collaborator

Summary,

  1. Always highly recommended move the refactor part to a different PR especially the current one is critical enough, that will make the PR / review process way more efficient
  2. I didn't see the importance / necessity of breaking prepareBlockAndFill into 3 pieces so far. If we really do need, a separate PR will be way clearer.
  3. We may be able to clean up duplicate semantics like AggregateStateType vs TrackingUpdatesType
  4. We may be able to clean up no_more_keys in this PR or in future PR
  5. Avoid template as much as possible unless it has real benefits and we will be blowing our binary size / compile time quickly and also harder to maintain / debug. Simple / direct code is usually better.
  6. Prefer writeVar for serde since they save disk / network bandwidth and fast as well. Avoid using writeText since they are slow and big.
  7. Let's start to comment more as much as possible. Current code base is kinda of slowly out of control regarding what each function does, what they mean. A bit too many branches / conditions to understand. Without documentation / comments, it will be very hard for different people to pick up.

@yl-lisen
Copy link
Collaborator Author

yl-lisen commented Feb 2, 2024

Summary,

  1. Always highly recommended move the refactor part to a different PR especially the current one is critical enough, that will make the PR / review process way more efficient
  2. I didn't see the importance / necessity of breaking prepareBlockAndFill into 3 pieces so far. If we really do need, a separate PR will be way clearer.
  3. We may be able to clean up duplicate semantics like AggregateStateType vs TrackingUpdatesType
  4. We may be able to clean up no_more_keys in this PR or in future PR
  5. Avoid template as much as possible unless it has real benefits and we will be blowing our binary size / compile time quickly and also harder to maintain / debug. Simple / direct code is usually better.
  6. Prefer writeVar for serde since they save disk / network bandwidth and fast as well. Avoid using writeText since they are slow and big.
  7. Let's start to comment more as much as possible. Current code base is kinda of slowly out of control regarding what each function does, what they mean. A bit too many branches / conditions to understand. Without documentation / comments, it will be very hard for different people to pick up.

Thanks, these are great suggestions.
I'm so sorry, i've separated part of the implementation, but it's still too big, I will separate the refactor parts in next PR

 * remove compile aggregate functions for streaming query now
 * remove no_more_keys
 * remove overflow_rows
 * move out refactor code for retract impl
@yl-lisen yl-lisen force-pushed the feature/issue-446-support-tracking-updated-aggr-windows-or-groups branch from 9c361a3 to 22a14c5 Compare February 2, 2024 17:59
@yl-lisen yl-lisen requested a review from chenziliang February 2, 2024 18:02
 * rename UpdatesTrackingData.h to TrackingUpdatesData.h
 * use temp arena instread of shared ptr of arena
@yl-lisen yl-lisen force-pushed the feature/issue-446-support-tracking-updated-aggr-windows-or-groups branch from bcd8479 to 7a78f20 Compare February 3, 2024 07:46
@yl-lisen yl-lisen merged commit 5dc02c0 into develop Feb 3, 2024
21 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support tracking whether the aggregation window or group is changed
3 participants