Supporting batch uploads from the client (and routing reports through the collector) #64
Comments
Can you elaborate on this? Is the idea here that when a client (that is, a real end-user client, not a batching one) gets a 200 OK after posting a report to a batching client, it can be assured that its report has been durably persisted somewhere? I think a leader server could provide a similar guarantee at the end of the upload phase, so I'm trying to understand what extra assurances the batching client provides.
IIUC the query flexibility is because the batching client can submit the same reports multiple times, in different-sized batches. If we decide this query flexibility is bad, we could mitigate this by having the original client include a report timestamp in the encrypted input, where it can't be tampered with by the batching client. Aggregators would then maintain query/privacy budgets per aggregation window and would be able to refuse queries on reports that fall in an aggregation window whose budget is already spent.
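To make the per-window budget idea concrete, here is a minimal sketch in Python. It is not taken from the spec: `WindowBudget`, `window_seconds`, and `max_queries` are hypothetical names, and it assumes the aggregator can read a tamper-proof report timestamp once the input is decrypted.

```python
from collections import defaultdict


class WindowBudget:
    """Tracks how many queries have touched each aggregation window (hypothetical)."""

    def __init__(self, window_seconds: int, max_queries: int) -> None:
        self.window_seconds = window_seconds
        self.max_queries = max_queries
        self.spent = defaultdict(int)  # window start -> queries already answered

    def window_for(self, report_timestamp: int) -> int:
        # Round the timestamp down to the start of its aggregation window.
        return report_timestamp - (report_timestamp % self.window_seconds)

    def try_query(self, report_timestamps) -> bool:
        """Charge every window the batch touches, or refuse without charging any."""
        windows = {self.window_for(t) for t in report_timestamps}
        if any(self.spent[w] >= self.max_queries for w in windows):
            return False  # at least one window has no budget left
        for w in windows:
            self.spent[w] += 1
        return True
```

With a budget of one query per hourly window, resubmitting reports from an already-queried hour in a different-sized batch would be refused, while reports from a fresh hour would still be accepted.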
I think the use-case is more that the collector is assured the system is working without requiring an interaction with the helpers. It is possible this case could be met by introducing some "do I have some reports" functionality though.
I think this is one part of it. There are a few ways this batching introduces flexibility even if reports can only be queried once. Mainly this is via separating / combining reports across multiple in-the-clear dimensions (in our design we give some info in the clear, like the advertiser site a user converted on). A collector could combine multiple small advertisers' reports together if they are individually too small to receive aggregate data. This is recoverable with a robust query model in the helpers, though it adds complexity. Another example along these lines is time-based querying. One collector might want data on hour boundaries, another might want it on 4-hour boundaries, etc.
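As a rough illustration of that flexibility, a collector holding the reports could regroup them along the in-the-clear dimensions before batching. This is only a sketch: `bucket_reports` and the `(site, timestamp, payload)` tuple shape are assumptions made for the example, not part of any design.

```python
from collections import defaultdict


def bucket_reports(reports, window_seconds):
    """Group (site, timestamp, payload) tuples by (site, window start).

    `site` and `timestamp` stand in for the in-the-clear metadata;
    `payload` is the opaque encrypted report, which is never inspected.
    """
    buckets = defaultdict(list)
    for site, timestamp, payload in reports:
        # Round down to the boundary this collector cares about.
        window_start = timestamp - (timestamp % window_seconds)
        buckets[(site, window_start)].append(payload)
    return dict(buckets)
```

Calling it with `window_seconds=3600` gives hourly batches, and `window_seconds=4 * 3600` gives 4-hour batches over the same reports.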
Closed the PR, but here's where we left the discussion: #78 (comment)
Seems like the protocol already has everything needed to address this issue. In a combined Collector-Leader deployment, the details of the upload protocol in the spec can probably just be disregarded. What matters for interop in that case is the aggregation and collection flows running between Leader and Helper. Closing as "won't fix". @csharrison please feel free to re-open if there's more to discuss.
In today's design call we discussed the collector receiving encrypted reports from clients and forwarding them to the leader. This aligns with the design we have in the WICG with some of the reasoning documented here.
I also brought this up for discussion on our regular calls in the WICG (minutes), where there was some agreement that this was a good idea.
Pros / Cons of routing reports through the collector
These are probably non-exhaustive.
Pros:
Cons:
Protocol solution
It seems there is a fairly simple solution to this problem, and that is to simply:
In the existing protocol there is no client authentication so it is technically possible to have a collector that just collects encrypted reports from clients and forwards them on to the leader. Of course the actual clients would need to be set up to do this but it is permitted by the protocol. By allowing batch uploads we just optimize this already-permitted configuration.
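A minimal sketch of such a forwarding collector, in Python, might look like the following. The names `BatchingCollector`, `send_batch`, and `batch_size` are hypothetical, and the actual upload request to the leader is deliberately left as a caller-supplied function because the batch upload format is not defined here.

```python
from typing import Callable, List


class BatchingCollector:
    """Buffers opaque encrypted reports and forwards them in batches (hypothetical)."""

    def __init__(self, send_batch: Callable[[List[bytes]], None], batch_size: int) -> None:
        self.send_batch = send_batch  # assumed: performs the upload to the leader
        self.batch_size = batch_size
        self.pending: List[bytes] = []

    def accept(self, encrypted_report: bytes) -> None:
        # The collector cannot decrypt the report; it only stores and forwards it.
        self.pending.append(encrypted_report)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        # Forward whatever is buffered as one batch, then start a new buffer.
        if not self.pending:
            return
        batch, self.pending = self.pending, []
        self.send_batch(batch)
```

Without batch uploads, `send_batch` would have to loop and post each report individually, which is the already-permitted configuration that batching would optimize.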
Alternatively, if we deem collector-clients to be bad for the protocol, we ought to have a mechanism which actually forbids them (e.g. by authenticating clients). However, I think that it is pretty reasonable to have this allowed by the protocol and leave it up to specific instantiations how the "client" is configured/trusted.