-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Geometry_Engine: Update PointClustersDBSCAN to use DomainTree #1888
Geometry_Engine: Update PointClustersDBSCAN to use DomainTree #1888
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Works fine, LGTM, but two questions ring in my head:
- before the proposed change,
PointClusetersDBSCAN
was and example of implementation of the generic DBSCAN algorithm, while now it does not have anything to do with it any more - if that can work on vectors in 3-dimensional space, couldn't it be generalized a bit cheaply?
These translate into: couldn't the tree become a part of the generic DBSCAN algorithm? I have read the PR description, but still, that would be soo nice!
It doesn't have anything to do with that specific method in Data_Engine, but it is still using the DBSCAN algorithm
I agree that that would be super nice, so I'll give it another go and see if I can be less annoyed and/or put in the extra work to make it feasible |
So it is still annoying. To make use of something similar to the existing version one would have to copy the tree structure over with an other collection of data, and iterate over that tree, and make sure the queries does not remove the object reference. First one can't be done with the interface as it is structured as it would by necessity return an interface, which breaks everything as it's recursively defined. The second one is possible but exceedingly annoying without making it an So to copy the one previously here where I keep the incidences in a tree. This does not work for the interface as it's kind of a special case that it can be traversed without the data present, which would need to be exchanged for So we got a method for any thing which one can construct a Will add documentation if this is general enough. @pawelbaran |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It works nice and super fast, I like the way of generalization (could possibly be used to other data types than 3-dimensional goemetry - correct me if I am wrong @kThorsager).
Happy to approve once the 2 code comments are replied 👍
Func<Point, Point, bool> metricFunction = (x, y) => x.SquareDistance(y) <= maxSqrDist; | ||
return points.ClusterDBSCAN(metricFunction, minPointCount); | ||
double sqDist = maxDist * maxDist; | ||
Func<Point, DomainBox> toDomainBox = a => a.IBounds().DomainBox(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we simply create a DomainBox
without calling a chain of methods here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, (but it's a bit ugly)
And keep track of the items to evaluate through their indecies as opposed to the items themself
Yes, anything which one can search for in a DomainTree and which can be clustered by its abstract DomainBoxes proximity (which is in a arbitrary dimension). The one thing is that this only supports the one Create method for the tree, DomainBoxes encompassing DomainBoxes, which should be fine. But then again, who knows how this might be used one day. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, works faster than GH! :D
/azp run BHoM_Engine.CheckInstaller |
Azure Pipelines successfully started running 1 pipeline(s). |
Issues addressed by this PR
Closes #1881
Had some hopes about making this general for DBSCAN and making them use
INode
, but there were annoyances as it i not aIEnumerable
, there are difficulties in changing the objects in the tree without messing with the data and the DBSCAN method would need to be refactored to look at things per Collection rather than per Item.And just adding clusters for IGeometries or ICurves was also annoying since no general distance method existed and would add a extra layer to this method. And there doesn't seem to be any huge need either so.
But the speed up is considerable, especially for large sets of points. 😁
(40000 points, 1.3m to 3.4s in the test file)
Test files
https://burohappold.sharepoint.com/:u:/s/BHoM/Ec8Vk7lcL1dCgXq-QdewJokBfJ4s8jwT5hIATIwiA-3WWg?e=RqtVBu
Changelog
Additional comments