Multiple attribution domains under one eTLD+1 #115

Closed
shigeki opened this issue Mar 8, 2021 · 8 comments
Labels
inactive? Issue may be inactive

Comments

@shigeki
Contributor

shigeki commented Mar 8, 2021

Note that words here follow old terminology.

In Chrome 89, a conversion destination is stored as a schemeful site of eTLD+1.
We have multiple service domains under one eTLD+1, and this change affects our conversion measurements.

In the following example, example.jp runs two services, shopping.example.jp and travel.example.jp, and each service has its own impression and conversion URLs.
As the figure below shows, the full credit of 100 is attributed to the travel.example.jp impression even though the conversion happened only on order.shopping.example.jp.
[Figure: issue]
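
The collapse described here can be reproduced with a minimal sketch. It assumes last-touch attribution and that the travel impression is the more recent one; neither is stated in the issue. The one-entry suffix set stands in for the real Public Suffix List, and the record fields (`id`, `destination`, `time`) are hypothetical.

```python
from urllib.parse import urlsplit

# Assumption: a one-entry set stands in for the real Public Suffix List.
PUBLIC_SUFFIXES = {"jp"}


def schemeful_site(url: str) -> str:
    """Reduce a URL to scheme + eTLD+1, e.g. https://example.jp."""
    parts = urlsplit(url)
    labels = parts.hostname.split(".")
    for i in range(len(labels)):
        if ".".join(labels[i:]) in PUBLIC_SUFFIXES:
            # Keep the public suffix plus one label to its left.
            registrable = ".".join(labels[max(i - 1, 0):])
            return f"{parts.scheme}://{registrable}"
    raise ValueError(f"no known public suffix in {url!r}")


# Hypothetical stored impressions; "time" is a simple ordering counter.
impressions = [
    {"id": "imp-shopping", "destination": "https://order.shopping.example.jp", "time": 1},
    {"id": "imp-travel", "destination": "https://travel.example.jp", "time": 2},
]


def attribute(conversion_url, stored):
    """Last-touch attribution scoped to the schemeful site (assumed model)."""
    site = schemeful_site(conversion_url)
    matching = [i for i in stored if schemeful_site(i["destination"]) == site]
    return max(matching, key=lambda i: i["time"], default=None)


winner = attribute("https://order.shopping.example.jp/thanks", impressions)
print(winner["id"])  # imp-travel
```

Both destinations reduce to https://example.jp, so the shopping conversion is credited to the travel impression.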
One fix would be to store the FQDN of the conversion origin as the conversion destination and to select the conversions to report by that origin. However, this might increase the risk of user tracking.

Alternatively, I think we could introduce a new domain ID attribute to separate impressions and conversions under one eTLD+1, and reduce the maximum impression data size so that the total entropy stays the same.
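
The entropy trade-off in this proposal can be sketched as simple bit accounting. The 64-bit budget used below is an assumed figure for illustration, and `remaining_impression_bits` is a hypothetical helper, not part of any API.

```python
def remaining_impression_bits(total_bits: int, domain_ids: int) -> int:
    """Bits left for impression data once a domain ID is carved out.

    total_bits: assumed overall budget for impression-side data.
    domain_ids: number of distinct services under the eTLD+1.
    """
    if domain_ids < 1:
        raise ValueError("need at least one domain ID")
    # Bits needed to distinguish `domain_ids` values (0 bits for one value).
    id_bits = (domain_ids - 1).bit_length()
    if id_bits > total_bits:
        raise ValueError("domain ID does not fit in the budget")
    return total_bits - id_bits


# Two services (shopping, travel) cost one bit of impression data.
print(remaining_impression_bits(64, 2))  # 63
```

Under this accounting, the domain ID plus the shrunken impression data carry the same total number of bits as before.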

Or can #114 solve this issue by filtering conversion data?

@csharrison
Collaborator

Thanks for filing! We moved from origin-based attribution scoping on the destination to eTLD+1-scoped attribution to allow landing pages and conversion pages to be on separate origins. This helps in cases where the landing page is on shoes.com but the conversion happens on, for instance, purchase.shoes.com.

Introducing an opt-in for tighter attribution scoping is useful. The API currently offers one way to do this. Since attribution is scoped to a <attributeon, reportto> pair, you can shard the reporting origin per destination. For example, you could have https://travel.example.jp be the configured reporting origin for travel and https://shopping.example.jp be the configured reporting origin for shopping.
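
A minimal sketch of the sharding idea, assuming last-touch attribution and a simplified scope key of (schemeful destination site, reporting origin). The record fields (`destination`, `report_to`, `time`) and the one-entry suffix set are hypothetical stand-ins, not the API's actual storage format.

```python
from urllib.parse import urlsplit

# Assumption: a one-entry set stands in for the real Public Suffix List.
PUBLIC_SUFFIXES = {"jp"}


def schemeful_site(url: str) -> str:
    """Reduce a URL to scheme + eTLD+1."""
    parts = urlsplit(url)
    labels = parts.hostname.split(".")
    for i in range(len(labels)):
        if ".".join(labels[i:]) in PUBLIC_SUFFIXES:
            return f"{parts.scheme}://{'.'.join(labels[max(i - 1, 0):])}"
    raise ValueError(f"no known public suffix in {url!r}")


def scope_key(destination: str, report_to: str):
    """Attribution scope: the <attributeon, reportto> pair."""
    return (schemeful_site(destination), report_to)


# Each service uses its own reporting origin, as suggested above.
impressions = [
    {"id": "imp-shopping", "destination": "https://shopping.example.jp",
     "report_to": "https://shopping.example.jp", "time": 1},
    {"id": "imp-travel", "destination": "https://travel.example.jp",
     "report_to": "https://travel.example.jp", "time": 2},
]


def attribute(conversion_url, report_to, stored):
    """Last-touch attribution within one scope key (assumed model)."""
    key = scope_key(conversion_url, report_to)
    matching = [i for i in stored
                if scope_key(i["destination"], i["report_to"]) == key]
    return max(matching, key=lambda i: i["time"], default=None)


winner = attribute("https://order.shopping.example.jp/thanks",
                   "https://shopping.example.jp", impressions)
print(winner["id"])  # imp-shopping
```

The later travel impression no longer wins, because its reporting origin puts it in a different scope even though the destination site is the same.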

I am not opposed in general to adding other mechanisms of opting in (like the conversion-filters proposal). This may also be something that is configurable in an attribution worklet (issue #114).

@shigeki
Contributor Author

shigeki commented Mar 8, 2021

In our current origin trial, we have only one domain for reporting since it is shared with our production ad services, so we will evaluate how much this issue affects our conversions through trials.
The attribution worklet seems to be more flexible, and we are looking forward to having it. Thanks.

@johnwilander

johnwilander commented Mar 8, 2021

Note the attack I described in privacycg/private-click-measurement#57 (the same analysis goes for the destination website):

Why Not Attribution Reports To Subdomains?

Some have requested that attribution reports be sent to the full domain of the site where the click happens and similarly the full domain of the site where the conversion happens.

Neither of these meet our privacy requirements. In both cases, subdomains can be chosen to convey further information about the click or conversion.

Imagine for instance social.example where the ad click happens making sure the site is loaded from the subdomain johnwilander.social.example when I'm logged in there and from the subdomain janedoe.social.example when Jane Doe is logged in. That would take us back to cross-site tracking in the subsequent report.

The reason for restricting PCM reports to registrable domains is that the scheme+registrable domain, a.k.a. schemeful site, is the only part of a URL that is free from link decoration. All other parts can be made user specific, including subdomains.

You could of course imagine social.example setting up a registrable domain per user, such as johnwilander-social.example, and load the whole website from that domain when I'm logged in to get back to cross-site tracking of clicks. If that happens, we'd have to deal with it but at least the user has a chance to see that a personalized domain is used through the URL bar.
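
The link-decoration argument can be illustrated with a short sketch. The suffix set is an assumed stand-in for the Public Suffix List (treating `example` as the public suffix), and the paths and query strings are made up for illustration.

```python
from urllib.parse import urlsplit

# Assumption: "example" is treated as the only public suffix.
PUBLIC_SUFFIXES = {"example"}


def schemeful_site(url: str) -> str:
    """Reduce a URL to scheme + registrable domain."""
    parts = urlsplit(url)
    labels = parts.hostname.split(".")
    for i in range(len(labels)):
        if ".".join(labels[i:]) in PUBLIC_SUFFIXES:
            return f"{parts.scheme}://{'.'.join(labels[max(i - 1, 0):])}"
    raise ValueError(f"no known public suffix in {url!r}")


# User-specific subdomains, paths, and query strings are all stripped.
decorated = [
    "https://johnwilander.social.example/ad?click=123",
    "https://janedoe.social.example/ad?click=456",
]
sites = {schemeful_site(u) for u in decorated}
print(sites)  # {'https://social.example'}

# A registrable domain per user survives the reduction, which is the
# residual risk mentioned above.
per_user = schemeful_site("https://johnwilander-social.example/")
print(per_user)  # https://johnwilander-social.example
```

Everything a site can freely vary per user disappears in the reduction, while a per-user registrable domain remains visible in the URL bar.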

@shigeki
Contributor Author

shigeki commented Mar 8, 2021

@johnwilander Thanks for your explanation. I had not considered the privacy risk of link decoration, and I agree with your point.

@nightpool

> The reason for restricting PCM reports to registrable domains is that the scheme+registrable domain, a.k.a. schemeful site, is the only part of a URL that is free from link decoration. All other parts can be made user specific, including subdomains.

Is this actually true? Couldn't a site gain e.g. 8 more bits of entropy by registering 256 public domains and funneling their conversions through those domains based on some sort of user ID? I'm not 100% certain on whether this would be a viable attack on this protocol, since I don't understand the threat model entirely, but it does seem to have similar problems at scale to the subdomain issue you talk about.

@johnwilander

> > The reason for restricting PCM reports to registrable domains is that the scheme+registrable domain, a.k.a. schemeful site, is the only part of a URL that is free from link decoration. All other parts can be made user specific, including subdomains.
>
> Is this actually true? Couldn't a site gain e.g. 8 more bits of entropy by registering 256 public domains and funneling their conversions through those domains based on some sort of user ID? I'm not 100% certain on whether this would be a viable attack on this protocol, since I don't understand the threat model entirely, but it does seem to have similar problems at scale to the subdomain issue you talk about.

If you want to discuss PCM, you can use its repo here: https://github.com/privacycg/private-click-measurement/issues

On this specific issue, there are three important differences between different subdomains on a single eTLD+1, and different eTLD+1s:

  • Subdomains can share cookies whereas the eTLD+1s cannot.
  • If the user re-engages a couple of days later, they won't end up on the specific eTLD+1 set up for them without extra tracking powers or pure luck (they kept the tab and found it).
  • eTLD+1 typically carries branding. If you land on a randomized domain to buy something, you won't have brand recognition and you might get suspicious that this is a scam.

@rob123ui

@johnwilander

@csharrison
Collaborator

Note: since we have the conversion filters feature, we have partial support for this use case, in that you can selectively drop conversions if they do not match. This is only partial support because we don't selectively attribute, for privacy-related reasons. However, that is being reconsidered in #523. Let's follow up there if that support is necessary.
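
The difference between dropping and selectively attributing can be sketched as follows. This is a simplified model under stated assumptions: last-touch selection, and hypothetical record shapes (`site`, `filter_data`, `filters`, a single `service` key) that do not reproduce the API's real header format.

```python
def attribute_then_filter(impressions, conversion):
    """Pick the winning impression first, then apply the filter.

    Returns the attributed impression id, or None if the conversion is
    dropped. There is no fallback to an earlier matching impression.
    """
    candidates = [i for i in impressions if i["site"] == conversion["site"]]
    if not candidates:
        return None
    # Step 1: attribution at eTLD+1 scope (last touch assumed).
    winner = max(candidates, key=lambda i: i["time"])
    # Step 2: a filter mismatch drops the conversion entirely.
    if winner["filter_data"].get("service") != conversion["filters"].get("service"):
        return None
    return winner["id"]


impressions = [
    {"id": "imp-shopping", "site": "https://example.jp", "time": 1,
     "filter_data": {"service": "shopping"}},
    {"id": "imp-travel", "site": "https://example.jp", "time": 2,
     "filter_data": {"service": "travel"}},
]

shopping_conversion = {"site": "https://example.jp",
                       "filters": {"service": "shopping"}}
travel_conversion = {"site": "https://example.jp",
                     "filters": {"service": "travel"}}

print(attribute_then_filter(impressions, shopping_conversion))  # None
print(attribute_then_filter(impressions, travel_conversion))    # imp-travel
```

The shopping conversion is no longer credited to the travel impression, but it is not credited to the earlier shopping impression either, which is why the support is described as partial.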
