Use more efficient crossfilter range/value filters #478
Comments
Good optimization. This will need some testing prior to acceptance.
@mtraynham, have you looked into this further? I believe we've run into performance problems because dc.js is always using filterFunction. It's especially a problem when brushing, where you want the fastest response time.
@gordonwoodhull, I took it one step further, actually. I forked Crossfilter and added functionality that would let me make better use of its filtering. My fork of Crossfilter included Jason Davies' union filter as well as a patch to identify what kind of values are in the dimension (object, array, or primitive). This allowed me to offload as much as I could to Crossfilter in the base filterHandler of dc.js. I also removed any references to the RangeFilter in dc.js, as Crossfilter supports simple ranges. The code below has three cases:
I'm actually not using this code anymore, since I've found other means to do filtering on the server, but here it is. I don't want to say it, but Crossfilter is somewhat dead... so it might be worth considering forking it for dc.js.
My patches for Crossfilter were only two commits: https://github.com/mtraynham/crossfilter
Thanks @mtraynham, that is very intriguing. Yes, it does seem that crossfilter is not actively maintained - to be kind, you could call it "complete" rather than "dead". ;-) Forking it to the dc-js organization is something to seriously consider. That would be taking on some responsibility though. Another direction to consider is that dc.js is not tied very tightly to crossfilter. (I think there are not many calls into its API beyond [...])
Wow, you did composite dimensions as well. Can't wait to look into that!
Yeah, it was pretty interesting. The biggest thing was trying to figure out what was performant. The heap sort Crossfilter uses can be slow for items that compare as similar, such as lengthy arrays. But how exactly do you sort an array of arrays, at least from Crossfilter's point of view? From some performance testing, sorting arrays should be avoided, and using filterFunction was key. Crossfilter sorting can be avoided if you pass in objects... they don't recognize coercion, so sorting is almost ignored.
When building charts, we designed a scheme that had dimensions, non-filter dimensions, and groups. For non-filter dimensions, we created a generic reducer that would bubble up, basically wrapping reducers with other reducers. The dimensional reducers would cache the values, so there was only one crossfilter lookup. Using some tricks, we built these crazy reducer objects from a chain of reducers and flattened them out in an overridden .data() function. That way you could just assign accessors to the 'columns'.
Even though it all worked, our long pole was the original data download. We were just pushing too much over the wire, and it caused long load times. For instance:
PourOver would be cool. Our server side solution is very similar in nature.
Retitling because I keep trying to find this.
One idea in #988, devolving the chart registry, is to sort of "render" the filter into a query rather than always generating a filter function as we're doing now.
Even though Crossfilter does give us the dimension.filterFunction capability, the code should still try to use dimension.filter whenever possible. In the case of brush ranges, crossfilter can use its bisect function (which binary searches the sorted values) and process adds/removes on only the affected portions of the index, rather than the whole index.
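To illustrate why a range filter can be cheaper than a predicate, here is a minimal sketch in plain JavaScript. It is not Crossfilter's actual code: `bisectLeft`, `rangeBounds`, and `predicateMatches` are hypothetical helpers written for this example, and the values are assumed to be already sorted, as they are inside a dimension's index.

```javascript
// Hypothetical helper: first index i in sorted array `a` with a[i] >= x.
function bisectLeft(a, x) {
    let lo = 0, hi = a.length;
    while (lo < hi) {
        const mid = (lo + hi) >>> 1;
        if (a[mid] < x) lo = mid + 1;
        else hi = mid;
    }
    return lo;
}

// Range filter [min, max): two binary searches yield the index bounds,
// so only O(log n) comparisons are needed to locate the selection.
function rangeBounds(sortedValues, min, max) {
    return [bisectLeft(sortedValues, min), bisectLeft(sortedValues, max)];
}

// Predicate filter: nothing is known about the function,
// so every value in the index has to be tested.
function predicateMatches(sortedValues, predicate) {
    const matches = [];
    for (let i = 0; i < sortedValues.length; i++) {
        if (predicate(sortedValues[i])) matches.push(i);
    }
    return matches;
}

const values = [1, 3, 3, 5, 8, 13, 21];
const bounds = rangeBounds(values, 3, 13);                          // [1, 5]
const scanned = predicateMatches(values, d => d >= 3 && d < 13);    // [1, 2, 3, 4]
```

Both approaches select the same records (indices 1 through 4), but the range version never touches values outside the two bounds. When a brush moves, only the records between the old and new bounds would need to be added or removed.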
Adding the filters.length === 1 conditional to dc.baseMixin's _filterHandler would work accordingly:
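The snippet originally attached here is not preserved in this copy. What follows is only a rough sketch of the idea, under stated assumptions: `rangedFilter` and `makeRecordingDimension` are hypothetical stand-ins (a range filter tagged with `isFiltered` and `filterType`, and a fake dimension that records calls), so the example runs without dc.js or Crossfilter. The dimension methods it calls (`filter`, `filterExact`, `filterRange`, `filterFunction`) are the ones Crossfilter exposes.

```javascript
// Hypothetical range filter: a [low, high) pair tagged so the handler can recognize it.
function rangedFilter(low, high) {
    const range = [low, high];
    range.isFiltered = v => v >= low && v < high;
    range.filterType = 'RangedFilter'; // assumed tag for this sketch
    return range;
}

// Hypothetical stand-in for a crossfilter dimension that records which method was used.
function makeRecordingDimension() {
    const calls = [];
    return {
        calls,
        filter: v => calls.push(['filter', v]),
        filterExact: v => calls.push(['filterExact', v]),
        filterRange: r => calls.push(['filterRange', r]),
        filterFunction: f => calls.push(['filterFunction', f]),
    };
}

// Sketch of a filter handler that falls back to filterFunction only when needed.
function filterHandler(dimension, filters) {
    if (filters.length === 0) {
        dimension.filter(null); // clear the filter
    } else if (filters.length === 1 && typeof filters[0].isFiltered !== 'function') {
        dimension.filterExact(filters[0]); // single plain value
    } else if (filters.length === 1 && filters[0].filterType === 'RangedFilter') {
        dimension.filterRange(filters[0]); // single range, e.g. from a brush
    } else {
        // Several filters, or a custom one: test each record against all of them.
        dimension.filterFunction(d => filters.some(f =>
            typeof f.isFiltered === 'function' ? f.isFiltered(d) : (f <= d && f >= d)));
    }
    return filters;
}

const dim = makeRecordingDimension();
filterHandler(dim, []);
filterHandler(dim, ['a']);
filterHandler(dim, [rangedFilter(1, 5)]);
filterHandler(dim, ['a', rangedFilter(1, 5)]);
console.log(dim.calls.map(c => c[0]));
// prints [ 'filter', 'filterExact', 'filterRange', 'filterFunction' ]
```

With this shape, the common brushing case (exactly one range filter) goes through filterRange, and filterFunction is reserved for multiple or custom filters.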