-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: make canViewerDeleteAsync recursive #224
Conversation
680e771
to
49ec9b7
Compare
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #224 +/- ##
===========================================
+ Coverage 99.89% 100.00% +0.10%
===========================================
Files 68 69 +1
Lines 1831 1892 +61
Branches 244 265 +21
===========================================
+ Hits 1829 1892 +63
+ Misses 2 0 -2
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
6a13c6f
to
93a1807
Compare
93a1807
to
529741d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not to block this PR but I think future use cases we need to be careful about are:
-
entities that have large numbers of references, like when many build entities reference the same account or the same user. Possible solutions are: chunking, and allowing some entities to bypass needing to be loaded to perform these checks. For instance, some entities like update-asset edges might have privacy policies that don't depend on both the viewer and the specific entity instance being deleted -- we don't need to load the actual update-asset edge entity to know its privacy policy outcome. Or phrased differently, what if privacy policy methods were split into InstancePrivacyPolicies and ClassPrivacyPolicies? And possibly ViewerAgnostic versions that don't get a viewer context? Can explain in person if this is confusing.
-
entities that reference multiple other entities - builds reference both accounts and users, which means that if we delete a user and their primary account, we might check 2x whether we can delete a build, one time when we delete the user and one time when we delete the account. Memoizing the can delete/can update results might be the solution, but comes with the messiness of introducing a cache.
|
||
// Take entity X which is proposed to be deleted, look at inbound edges (entities that reference X). | ||
// These inbound edges are the entities that will either get deleted | ||
// or updated with null based on the edge definiton when entity X is deleted. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// or updated with null based on the edge definiton when entity X is deleted. | |
// or updated with null foreign keys based on the edge definition when entity X is deleted. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to roughly what is suggested, though foreign keys are a relational concept while entity is theoretically db-agnostic so I used a different phrasing. Let me know if you think it still needs alteration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I wanted to clarify is that the FK fields of the referencing entities will be set to null. Basically, for "Updated with null" to be more precise and say what exactly will be set to null.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated again. Thanks for the clarification.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re: future use cases we need to be careful about
Yep, these are definitely considerations (mentioned them in the summary as well).
Agree that for certain classes of entities, it's sufficient to check permission on a single instance to know whether a group of entities loaded via an edge are authorized, but it's somewhat difficult to express that in the framework. For the update-asset edge example, the case is we just need to check if they all have the same owning app ID and then infer that they can be accessed. But this is a very application-specific constraint. Expressing it more generally would still require checking at least one instance and then also providing a uniformity function. Though this would still require loading all the entities and would only bypass extra privacy policy checks. Authorizing all nodes on the other end of an edge without loading the edges themselves is hard to express outside of application code. I assume this is the ClassPrivacyPolicies
concept you were ideating.
For entities that reference multiple other entities, I think it's best to not memoize since
a) the policies are called with different arguments (different cascading delete reasons in particular)
b) the dataloader hopefully makes authorization-through-edges fairly fast
My plan is to implement useful utilities for doing these batch deletions here in the entity repo and then do the application-specific logic in the expo server code. I'm not too sure where the boundary between the two will be yet.
|
||
// Take entity X which is proposed to be deleted, look at inbound edges (entities that reference X). | ||
// These inbound edges are the entities that will either get deleted | ||
// or updated with null based on the edge definiton when entity X is deleted. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to roughly what is suggested, though foreign keys are a relational concept while entity is theoretically db-agnostic so I used a different phrasing. Let me know if you think it still needs alteration.
Loadless privacy policy evaluation: I was thinking about entities where the privacy policy doesn't depend on the entity itself ("only super users can edit this") or the doesn't depend on the viewer ("everyone can view this public object"). In the former case, we don't need to load the entity (hence a class method instead of instance method). And we'd be able to coalesce privacy policy checks (if we have 100 entities of the same type and the privacy policy is a class method, we can just evaluate the privacy policy once and know the answer for all 100 entities). Memoization: That's a good point about cascading delete reasons being different. |
Why
canViewerDeleteAsync
historically just checked the privacy policy on the single entity, but didn't check the cascading deletion behavior of all entities that were being deleted. This could result in false-positives and was more an approximation of whether an entity could be deleted.This PR fixes this bug by recursively traversing the deletion cascades in the same way that EntityMutator does and returning the conjunction of the can deletes.
How
In addition to updating the behavior, this moves the functions out of the entity
this
type for class inference and makes the invocation explicit. This is so the function can be used recursively on a unknown entity type (whereas calling functions with explicitthis
parameter type need to be called on a concrete type to typecheck correctly).Test Plan
Ensure tests pass.
Next steps
One pattern we're seeing more of is needing to process large amounts of data during deletions. I'm still figuring out how to codify this in the entity library, but likely we'll want something like the following:
In background job (not in web request since it processes too much data):
canViewerDeleteAsync
for object X which we want to delete asynchronously. W may need some sort of memory limiter likepromise-limit
.