Network Error Logging (NEL) #99

Open
digitarald opened this issue Jul 26, 2018 · 22 comments · May be fixed by #1141

@digitarald

Request for Mozilla Position on an Emerging Web Specification

@dbaron dbaron added the venue: W3C Specifications in W3C Working Groups label Aug 9, 2018
@dbaron
Contributor

dbaron commented Aug 9, 2018

cc @mcmanus for his thoughts

@mcmanus

mcmanus commented Aug 10, 2018

I looked at this a couple of years ago. I had a few minor concerns about what was being reported, interactions with CORS, etc. I would have to spend some time looking them up (and can do so). They are likely addressable, and overall the idea is OK: the major value is that it will give folks ways to monitor the rollout of more advanced features, and therefore reduce risk and incentivize deployment.

But the bigger concern at the time was that there was very little interest in deploying this server-side other than at its sponsor, Google. Has that changed? Without wide interest it's not going to incentivize deployment but will add complexity to the ecosystem.

@dcreager

Hiya, I'm the editor of the relevant specs, and I'm happy to address any questions or concerns you might have (here or over in the spec repos).

One important point to clarify is that we factored out the report delivery portion into a separate Reporting spec (repo). Network Error Logging (repo) now only covers defining network errors (and successes) and how they map to report payloads. (Not sure if that happened after you most recently took a look.)

Report uploads should be hooked into CORS correctly (they're subject to preflights if the collector is at a different origin). The spec defines client-side failover and load balancing à la DNS SRV records. There's also now a JavaScript observer API for getting script-side access to reports — although NEL reports are explicitly excluded from being observable, to prevent leaking sensitive network reliability information.
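
As a rough illustration of the SRV-style failover and load balancing mentioned above, here is a minimal Python sketch. It assumes each collector endpoint carries a `priority` and a `weight`, as in the Reporting draft's endpoint groups. The `Endpoint` class, the `failing` flag, the `choose_endpoint` helper, and the collector URLs are hypothetical names for this example, not anything taken from the spec or from a browser implementation.

```python
import random
from dataclasses import dataclass


@dataclass
class Endpoint:
    url: str
    priority: int  # lower value is tried first, as with DNS SRV
    weight: int    # relative share of uploads within one priority level
    failing: bool = False  # hypothetical flag, set after failed uploads


def choose_endpoint(endpoints, rng=random):
    """Pick a collector: lowest priority first, weighted random within it."""
    usable = [e for e in endpoints if not e.failing]
    if not usable:
        return None  # nothing to upload to right now
    best = min(e.priority for e in usable)
    candidates = [e for e in usable if e.priority == best]
    weights = [e.weight for e in candidates]
    if sum(weights) == 0:
        # All weights are zero: fall back to a uniform choice.
        return rng.choice(candidates)
    return rng.choices(candidates, weights=weights, k=1)[0]


# Hypothetical endpoint group: two primary collectors and one backup.
group = [
    Endpoint("https://collector-a.example/upload", priority=1, weight=3),
    Endpoint("https://collector-b.example/upload", priority=1, weight=1),
    Endpoint("https://backup.example/upload", priority=2, weight=1),
]
```

With this group, uploads go to the two priority-1 collectors in roughly a 3:1 ratio, and the backup is only chosen once both of them are marked as failing.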

On the NEL side, we've done some work in the last couple of months to tighten up the security and privacy constraints — for instance, by preventing DNS rebinding and subdomain policy attacks. (See w3c/network-error-logging#74 for the gory details.)

Reporting is also going to be used to deliver other types of reports than NEL — CSP is adding support for it in its next revision, and there are some predefined browser events (deprecations, interventions, crash reports) that we're defining in Reporting itself.

We have received out-of-band signals from some external developers who are ready to try this out once it goes live in Chrome, though I don't have any hard numbers on that. We're trying to minimize the effort required to adopt Reporting and NEL by having an open-source reference collector implementation available.

Hopefully this addresses some of your questions; let me know if there's anything else you want to dive into!

@ScottHelme

It'd be awesome to see Firefox support NEL, and by extension the Reporting API.

We've added support at https://report-uri.com so hopefully that will allow site operators to enable this feature more easily without having to build their own reporting endpoint: https://scotthelme.co.uk/introducing-the-reporting-api-nel-other-major-changes-to-report-uri/

If adoption is a concern then perhaps this will give it a bump.

@dbaron
Contributor

dbaron commented Nov 29, 2018

Also worth noting there's an explainer.

@dbaron
Contributor

dbaron commented Nov 29, 2018

cc @ddragana @bzbarsky @martinthomson as well, for their thoughts.

The idea seems reasonable to me -- obviously some of the value depends on how widely deployed it ends up being, as @mcmanus noted above.

(And if it does become widely deployed, it's clearly advantageous for a browser to implement it because then their users are likely to get better experiences whenever any of the errors are specific to some browsers but not others.)

My initial reaction is that it seems to fit within the worth prototyping category in https://mozilla.github.io/standards-positions/ -- this presumes that (a) it seems likely to be useful and (b) there doesn't appear to be anything harmful about it.

@annevk
Contributor

annevk commented May 14, 2019

There are a few things here that need more consideration, I think:

  1. It relies on the Reporting API, discussed in Reporting API #104, which provides cookie-like tracking capabilities. https://w3c.github.io/network-error-logging/#privacy-considerations does go into this, but the mitigation of requiring secure contexts does not seem effective.
  2. It exposes network errors that we've historically been uncomfortable exposing to JS. It's not entirely clear to me how this changes those tradeoffs.

@dcreager

For (2), NEL doesn't expose any new errors to JavaScript. The spec calls out that NEL reports are not visible to ReportingObservers.

Instead, NEL reports are only uploaded to the collectors defined by the owner of the recipient of a request. If the originator is different, they don't get to see any NEL reports about the success or failure of the outbound request.

@annevk
Contributor

annevk commented Jul 21, 2020

To be clear, the concern is not that it exposes new errors to JavaScript, it's that it exposes new errors.

I think our stance for this should be harmful. While I think we should be supportive of reporting things that are already otherwise exposed to improve developer ergonomics, using reports for information that is not otherwise known is a lot harder to justify. Additionally, while reporting in general is now per-document, NEL is not and still has the cache problem.

(There's also the problem that none of the network errors are specified in terms of the low-level primitives defined in Fetch.)

@dcreager

> To be clear, the concern is not that it exposes new errors to JavaScript, it's that it exposes new errors.

Exposed to whom? It's not just that the errors aren't exposed to JavaScript — they're not exposed to the originator of a request through any means. The errors are only exposed to the recipient of the request, who would see the same information in their server logs for successful requests, and even for failed requests that make it past a certain point in the connection establishment process.

> using reports for information that is not otherwise known is a lot harder to justify

I completely agree with this. We've tried to be very careful to not expose new information, and not expose anything to unauthorized parties. These are the principles we followed when designing NEL (from a paper we presented at NSDI back in February):

  1. We cannot collect any information about end users, their device/user agent, or their network configuration, that the server does not already have visibility into. That is, we should not collect new information relative to existing server logs; only existing information in a different place.
  2. We can only collect information about requests that user agents issue when users voluntarily access services on the Web. We cannot issue requests in the background (i.e., outside of normal user activity), even though this prevents us from proactively ascertaining service reachability.
  3. End users can opt out of collection at any time, either globally or on a per-site basis. Support for respecting opt-outs must be implemented by NEL-compliant user agents, so that users do not need to trust service providers for opt-outs to take effect.
  4. Modulo that end-user opt-out, it is only the site owner who gets to decide whether reports are collected about a particular site, and if so, where they are sent. Third parties (including browser vendors) must not be able to use NEL to monitor sites that they do not control.

Is your concern that something like NEL would be harmful even if it followed these principles? Or that NEL as currently designed doesn't follow them?

@annevk
Contributor

annevk commented Jul 21, 2020

To be clear, I understand it's all "same origin".

It's not clear to me how NEL follows those principles. E.g., how would example.com know I cannot get to their DNS records? How could example.com identify a specific user from errors in their server logs? How would they know an IP address is invalid?

@martinthomson
Member

I'd like to put this one to bed, but there seems to be a thicket of issues to resolve first.

As far as @annevk's concern about exposing new information goes, I'd like to resolve that. Two things might help here:

  • What new information is being exposed? I can see that the proposal exposes information about DNS resolution and some very specific TLS errors, which might be new information. Is there anything else?

  • If the intent is not to expose this information, can we consider any release of new information as a bug in the specification that should be fixed? If so, we can ensure that offending items are identified and removed.
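
To make the DNS case concrete, the following Python sketch builds what a report for a failed lookup might look like. The field names follow the NEL draft as understood here; all values are made up, and `failed_before_reaching_server` is a hypothetical helper for this example only.

```python
import json

# Hypothetical report for a failed DNS lookup. Field names follow the NEL
# draft as understood here; every value is invented for illustration.
report = {
    "age": 0,
    "type": "network-error",
    "url": "https://example.com/",
    "body": {
        "sampling_fraction": 1.0,
        "server_ip": "",   # empty: no address was ever resolved
        "protocol": "",
        "method": "GET",
        "status_code": 0,  # no HTTP response was received
        "elapsed_time": 143,
        "phase": "dns",
        "type": "dns.name_not_resolved",
    },
}


def failed_before_reaching_server(report):
    """True when the request failed during DNS resolution."""
    return report["body"]["phase"] == "dns"


print(json.dumps(report, indent=2))
print(failed_before_reaching_server(report))
```

A failure in the `dns` phase never produces a connection to the site's servers, which is why it would not show up in ordinary server logs.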

The potential abuse as a supercookie seems to have been resolved with the reporting API, so it would be good to confirm that the same applies here. Of course, the spec hasn't tracked reporting API changes, so it is unclear.

Those are the important items, based on the conversation.

Understanding adoption status (by sites other than Google properties as noted) would be good.

I also have a bunch of concerns about the specification itself. This hasn't tracked changes in the Reporting API and there has been no real activity on the spec in almost 2 years. So it seems like it might have been neglected a little. For example, the NEL header field is defined using defunct syntax (see RFC 8941) and it hasn't been registered in the appropriate place.

@tantek was looking to resolve this as 'harmful', which I think is fair given the conversation so far and the current state of the specification. However, good answers to the above might change that disposition.

@joseba4242

login.microsoftonline.com uses NEL.

@paulmillar

Just to mention it, CloudFlare appears to be using NEL.

@cdanis

cdanis commented Jun 6, 2022

Reporting from the Wikimedia Foundation, the non-profit that maintains Wikipedia and other related projects: we use NEL, and it has been really important for detecting outages that otherwise would have either been missed or only caught due to manual user reports.

@ScottHelme

> Understanding adoption status (by sites other than Google properties as noted) would be good.

On 5th Jun 2022 there were 177,229 sites [1] serving a NEL header in the Top 1 Million Sites (list provided by Tranco [2]), which indicates that almost 18% of sites are using NEL.

At Report URI [3], we process a little over 5,000,000 NEL reports per day, with none of them coming from Google owned properties or from Cloudflare managed properties. There are also other reporting platforms out there capable of ingesting NEL reports for websites for which I don't have any data to reference publicly.

All in all, I think there's a reasonably large collection of sites out there that use NEL already and my data shows that the number of sites using it is steadily increasing.

[1] https://crawler.ninja/files/nel-sites.txt
[2] https://tranco-list.eu
[3] https://report-uri.com
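
For reference, the arithmetic behind the adoption figures above can be checked with a few lines of Python. The per-second rate is derived here from the quoted daily volume and is only an approximation, since the source says "a little over" 5,000,000.

```python
# Figures quoted in the comment above.
nel_sites = 177_229
list_size = 1_000_000
reports_per_day = 5_000_000  # "a little over", so a lower bound

share = nel_sites / list_size * 100
reports_per_second = reports_per_day / (24 * 60 * 60)

print(f"{share:.1f}% of the list serves a NEL header")
print(f"about {reports_per_second:.0f} reports per second on average")
```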

@sefeng211 sefeng211 mentioned this issue May 1, 2023
@polcak

polcak commented May 16, 2023

We are doing research on NEL.

First of all, we have analyzed HTTP Archive data on NEL deployment. Deployment has risen from 0 to 11.73% (almost 2,250,000 unique domains) since 2019. Current deployment is dominated by Cloudflare. This paper has not yet been submitted and is a work in progress.

Second of all, we have focused on data protection and security issues with Network Error Logging and have an accepted paper for SECRYPT'23. Our conclusions are:

  • The ePrivacy Directive regulates publicly available services and networks and applies to a range of technologies, not only cookies. NEL requires storing policies in the browser (W3C work-in-progress standard, Process policy headers, step 14). Hence, the ePrivacy Directive applies. NEL is not strictly necessary: Principle 3 from the original NEL paper says users should be able to opt out, and the majority of servers do not deploy NEL. Hence, we consider that obtaining consent is necessary under the ePrivacy Directive. However, the standard does not mandate obtaining consent.
  • NEL security, privacy, and ethical principles are not fulfilled:
    • (Burnett et al., 2020) and (W3C, 2021) do not consider the persistence of NEL policies. This needs to be considered when a MitM is temporarily able to inject code (during travel, or on connections through unknown networks).
    • Web Workers allow deploying long-term trackers as well. However, the scope of Service Workers is limited to subpaths. When content creators can inject their own content, such as blogs or personal web pages on a shared server, NEL applies to all pages on the domain (and possibly subdomains) and allows them to track visitors on other parts of the server.
    • Some WebExtensions, such as NoScript Security Suite, block Service Workers but do not alter NEL. Their users are protected from a MitM injecting Service Workers but not from one injecting NEL policies.
    • NEL keeps collecting data after a user deploys a DNS firewall, and it signals the behaviour of the DNS firewall.
    • Users access services that they are not aware of (see the activities of European data protection authorities on online advertising). NEL deployed in such scenarios tracks requests that users did not voluntarily make.

We recommend:

  • Firefox should not deploy NEL as described in the current work-in-progress W3C standard.
  • Firefox should seek consent before accepting NEL policies.
  • Firefox should report through proxies to hide IP addresses of users (or facilitate other means to report without exposing IP addresses) to prevent exposing personal data to NEL collectors.
  • Firefox should remove policies of servers no longer announcing a NEL policy.
  • Firefox should not report full URLs to prevent exposing personal data to NEL collectors.
  • Firefox should not report to domains that the users do not access voluntarily. For example, NEL could report only on the availability of the domain currently displayed in the URL bar.
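
One way to picture the "no full URLs" recommendation is a client-side redaction step that runs before upload. This is a minimal Python sketch under stated assumptions: it treats reports as plain dictionaries with a `url` and an optional `body.referrer`, and `origin_only` and `redact_report` are hypothetical helpers, not part of any browser or of the NEL draft.

```python
from urllib.parse import urlsplit


def origin_only(url):
    """Reduce a URL to scheme://host[:port]/, dropping path, query, fragment."""
    parts = urlsplit(url)
    host = parts.netloc.rpartition("@")[2]  # also drop any user:password@
    return f"{parts.scheme}://{host}/"


def redact_report(report):
    """Return a copy of the report with URLs trimmed to their origin."""
    redacted = dict(report)
    redacted["url"] = origin_only(report["url"])
    body = dict(report.get("body", {}))
    if body.get("referrer"):
        body["referrer"] = origin_only(body["referrer"])
    redacted["body"] = body
    return redacted


# Hypothetical report carrying personal data in the path and query string.
example = {
    "url": "https://example.com/account/42?token=abc",
    "body": {"referrer": "https://example.com/search?q=secret", "phase": "dns"},
}
print(redact_report(example))
```

The collector still learns which origin failed and in which phase, but not which page or query the user was requesting.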

Please read the paper for more details. Do not hesitate to contact us for more information.

Edit: removed duplicate sentence fragment.

@mozfreddyb
Contributor

This seems to indicate that some of the issues we saw earlier are still unresolved. It's unfortunate that there was apparently not enough spec work to address these concerns.

@polcak Thank you for sharing this paper and your research with us.

If we want to get this triaged, I suggest we label this negative. Seems overdue.

mozfreddyb pushed a commit to mozfreddyb/standards-positions that referenced this issue May 17, 2023
@simon-friedberger
Member

Many of the previously concerning issues have been addressed in the spec. I expect that there will be further spec changes related to the discussions at TPAC as mentioned e.g. in 105.

Assuming that privacy issues are sufficiently addressed this can be reconsidered. Some conditions would be:

  1. Sufficient flexibility to omit parts of reports or entire reports. It must be possible for clients to offer appropriate privacy controls. This is already in the spec.
  2. Good indication about which data is useful for what. "request_headers" and success reports seem to have a bad cost/benefit ratio and there is not enough information available. See, e.g. 133. This is especially important for opt-out data collection.
  3. Usage of privacy preserving data collection mechanisms like OHTTP or PPM where necessary.

@zcorpan zcorpan reopened this Oct 4, 2023
@mozfreddyb
Contributor

Agreed. Happy for us to revisit, if our original concerns are going to be resolved.

@SulemanAhmadd

Following our (Cloudflare) discussions with Mozilla on the topic of client-side error reporting, we have compiled the following document. It aims to provide insights into the use-cases of NEL and privacy delta for each error report field consumed by Cloudflare (while keeping operational usability in mind): [SHARED] Cloudflare: NEL Usage Analysis.

Important things to note:

  • There is A LOT of value that we as infrastructure providers can capture from client-side error logs.
  • Cloudflare only processes a subset of NEL error report fields. For example, request/response headers information is not used by our deployment.
  • Each of our NEL use cases involves looking at error report trends as reported by multiple clients rather than analyzing individual isolated error reports.
  • The use of privacy-preserving technologies such as DAP (secure aggregation of multiple reports) and/or OHTTP (breaking linkage between error report and client) can help us further address some of the previously raised privacy concerns.
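
The trend-level analysis described in the list above can be sketched in a few lines of Python. This is plain counting, not DAP or any other secure aggregation protocol, and it is not Cloudflare's pipeline; `error_trends`, `spikes`, and the sample data are hypothetical. The error type strings follow the NEL draft's naming as understood here.

```python
from collections import Counter


def error_trends(reports):
    """Count reports per (phase, type), discarding per-client detail."""
    return Counter((r["body"]["phase"], r["body"]["type"]) for r in reports)


def spikes(counts, threshold):
    """Keep only the error classes seen at least `threshold` times."""
    return {key: n for key, n in counts.items() if n >= threshold}


# Hypothetical batch of reports from many clients.
batch = (
    [{"body": {"phase": "connection", "type": "tcp.timed_out"}}] * 5
    + [{"body": {"phase": "dns", "type": "dns.name_not_resolved"}}] * 2
)
counts = error_trends(batch)
print(spikes(counts, threshold=3))
```

Only the aggregate counts leave this step, which is the property that aggregation schemes such as DAP aim to enforce cryptographically rather than by convention.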

We hope the above document will be useful to Mozilla in reevaluating the deployment of client-side connection error logging. The hope is that it will help in weighing data exposure against what is required for operational usability, based on real-world deployment. I personally believe the takeaways align really well with @simon-friedberger's points above (to which the above document attempts to provide further guidance).

@valenting

Network error logging can provide useful feedback to servers regarding network failures across different browsers. While there are certain privacy issues that need to be handled with care, implementing a subset of this specification should be enough to prove useful for both server side applications monitoring the logs, and ideally useful for the users who get their networking issues addressed.
My proposed position is: positive

We are tracking the implementation work in bug 1145235, though we still have privacy concerns that need to be addressed in the implementation and hopefully in a spec update.

@zcorpan zcorpan moved this from Needs proposed position to Position is proposed in standards-positions review Nov 6, 2024
@zcorpan zcorpan moved this from Position is proposed to Needs a PR in standards-positions review Nov 14, 2024
zcorpan added a commit that referenced this issue Dec 9, 2024
@zcorpan zcorpan linked a pull request Dec 9, 2024 that will close this issue
Projects
Status: Needs a PR