Change representation of partition in FileScanConfig #4295
Comments
There is likely some overlap with #2293. I personally don't think we should have the concept of a partition at all, and should instead have a smarter work scheduler, but I haven't been able to work on that recently.
Yep, having partitions seems to be a limiting factor right now. There are two things on my plate right now:
In both of these points, somehow replacing partitions with a single queue would be helpful for me. But I understand that it might not be a priority, or a good enough solution, for the project right now. Anyway, the concept of a partition seems to sit pretty deep in the codebase; I saw that it is passed through the hierarchy of ExecutionPlans. I wonder what kind of scheduler you have in mind?
The scheduler I started work on preserved the concept of partitions, but did not rely on them for work distribution, or at least wouldn't have if I had actually finished it 😅
Yes, the hope was to gradually change to a push model for operators where it is possible
IMO fairness is better handled at a higher level, e.g. with separate query pools or even separate query processes. The scheduler should focus on throughput at the expense of fairness; if nothing else, fairly multiplexing queries is a recipe to blow your memory budget.
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Please correct me if I am wrong, but from what I understand, each partition from FileScanConfig (file_group) is executed sequentially. That means that if there is a large imbalance in the work to be done (e.g. partition A: 10 files, 10 MB; partition B: 10 files, 10 GB), the query will take as long as the largest partition needs to finish.
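To put rough numbers on that imbalance, here is a minimal sketch. It is not DataFusion code: it assumes scan time is proportional to file size, reads the example as 10 MB per file in A and 10 GB (taken as 10,000 MB) per file in B, and uses two hypothetical helper functions. It compares one worker per file group against workers pulling from a single shared queue.

```rust
/// Finish time (in size units) when each file group is scanned
/// sequentially by its own worker: the largest group dominates.
fn static_makespan(groups: &[Vec<u64>]) -> u64 {
    groups
        .iter()
        .map(|g| g.iter().sum::<u64>())
        .max()
        .unwrap_or(0)
}

/// Finish time when `workers` workers pull files from one shared queue.
/// Assumption: whichever worker is free first takes the next file.
fn shared_queue_makespan(files: &[u64], workers: usize) -> u64 {
    assert!(workers > 0, "need at least one worker");
    let mut loads = vec![0u64; workers];
    for &size in files {
        // Pick the worker with the least accumulated work so far.
        let idx = (0..workers).min_by_key(|&i| loads[i]).unwrap();
        loads[idx] += size;
    }
    loads.into_iter().max().unwrap_or(0)
}
```

Under these assumptions the static layout finishes after 100,000 MB worth of scanning (all of partition B on one worker), while two workers sharing one queue each end up with roughly half of the total.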
Describe the solution you'd like
I would like to implement work stealing, e.g. by sharing an emitter of PartitionedFile among the FileStreams.
Possible implementations:
1. { file_groups: Vec<Vec<PartitionedFile>> } -> { file_groups: Vec<Box<dyn Partition>> }. That way we keep the existing interface very similar to what we have now, and I would be able to make n virtual partitions that internally point to a single partition.
2. { file_groups: Vec<Vec<PartitionedFile>> } -> a queue/stream of files that can be shared among n workers (FileStreams; heads up, naming collision).
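A rough sketch of how the two options could fit together, using only the standard library. Everything here is an assumption for illustration: the `Partition` trait and its `next_file` method are hypothetical, `PartitionedFile` is a simplified stand-in for the real type, and plain threads stand in for FileStreams. The idea is that n virtual partitions are just handles onto one shared queue.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};
use std::thread;

/// Simplified stand-in for DataFusion's `PartitionedFile` (the real type has more fields).
#[derive(Debug, Clone)]
struct PartitionedFile {
    path: String,
}

/// Hypothetical trait: a source of files for one scan worker.
trait Partition: Send {
    fn next_file(&mut self) -> Option<PartitionedFile>;
}

/// Mirrors the current layout: a fixed list owned by one partition.
struct FixedGroup(VecDeque<PartitionedFile>);

impl Partition for FixedGroup {
    fn next_file(&mut self) -> Option<PartitionedFile> {
        self.0.pop_front()
    }
}

/// A virtual partition: a handle onto a queue shared with its siblings.
struct SharedGroup(Arc<Mutex<VecDeque<PartitionedFile>>>);

impl Partition for SharedGroup {
    fn next_file(&mut self) -> Option<PartitionedFile> {
        // The lock is held only while popping a single file.
        self.0.lock().unwrap().pop_front()
    }
}

/// Build `n` virtual partitions that all point at the same queue.
fn virtual_partitions(files: Vec<PartitionedFile>, n: usize) -> Vec<Box<dyn Partition>> {
    let queue = Arc::new(Mutex::new(VecDeque::from(files)));
    (0..n)
        .map(|_| Box::new(SharedGroup(Arc::clone(&queue))) as Box<dyn Partition>)
        .collect()
}

/// Drain every partition on its own thread; returns the paths each worker handled.
fn drain_in_parallel(partitions: Vec<Box<dyn Partition>>) -> Vec<Vec<String>> {
    let handles: Vec<_> = partitions
        .into_iter()
        .map(|mut p| {
            thread::spawn(move || {
                let mut done = Vec::new();
                while let Some(file) = p.next_file() {
                    done.push(file.path); // a real scan would read the file here
                }
                done
            })
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

With this shape, callers still see a `Vec<Box<dyn Partition>>`, so a `FixedGroup` keeps the current per-group behavior, while `virtual_partitions(files, n)` lets whichever worker is idle pick up the next file. Which worker gets which file is not deterministic; only the fact that each file is handed out once is.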