Network Error Logging (NEL) #99
Comments
cc @mcmanus for his thoughts |
I looked at this a couple of years ago. I had a few minor concerns about what was being reported, interactions with CORS, etc. I would have to spend some time looking them up (and can do so). They are likely addressable; overall the idea is OK. The major value is that it will give folks ways to monitor the rollout of more advanced features, and therefore reduce risk and incentivize deployment. But the bigger concern at the time was that there was very little interest in deploying this server side other than at its sponsor, Google. Has that changed? Without wide interest it's not going to incentivize deployment but will add complexity to the ecosystem. |
Hiya, I'm the editor of the relevant specs, and I'm happy to address any questions or concerns you might have (here or over in the spec repos). One important point to clarify is that we factored out the report delivery portion into a separate Reporting spec (repo). Network Error Logging (repo) now only covers defining network errors (and successes) and how they map to report payloads. (Not sure if that happened after you most recently took a look.) Report uploads should be hooked into CORS correctly (they're subject to preflights if the collector is at a different origin). The spec defines client-side failover and load balancing à la DNS SRV records. There's also now a JavaScript observer API for getting script-side access to reports — although NEL reports are explicitly excluded from being observable, to prevent leaking sensitive network reliability information. On the NEL side, we've done some work in the last couple of months to tighten up the security and privacy constraints — for instance, by preventing DNS rebinding and subdomain policy attacks. (See w3c/network-error-logging#74 for the gory details.) Reporting is also going to be used to deliver other types of reports than NEL — CSP is adding support for it in its next revision, and there are some predefined browser events (deprecations, interventions, crash reports) that we're defining in Reporting itself. We have received out-of-band signals from some external developers who are ready to try this out once it goes live in Chrome, though I don't have any hard numbers on that. We're trying to minimize the effort required to adopt Reporting and NEL by having an open-source reference collector implementation available. Hopefully this addresses some of your questions; let me know if there's anything else you want to dive into! |
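For readers unfamiliar with the two specs, here is a minimal sketch of what a site might send to opt in, assuming the `Report-To` and `NEL` response header formats as drafted at the time of this thread. The collector URLs, the group name, and the sampling fractions are hypothetical values chosen for illustration; the `priority` and `weight` fields are what drive the SRV-style failover and load balancing mentioned above.

```python
import json

# Hypothetical collector URLs: any HTTPS endpoints that accept report uploads.
PRIMARY = "https://collector-a.example/reports"
BACKUP = "https://collector-b.example/reports"


def build_reporting_headers(max_age_s: int = 86400) -> dict:
    """Return the two response headers that opt an origin into NEL (sketch)."""
    # Reporting: names an endpoint group. A lower priority number is tried
    # first; weight splits traffic among endpoints of equal priority.
    report_to = {
        "group": "network-errors",  # hypothetical group name
        "max_age": max_age_s,
        "endpoints": [
            {"url": PRIMARY, "priority": 1, "weight": 1},
            {"url": BACKUP, "priority": 2, "weight": 1},
        ],
    }
    # NEL: points at that group and sets sampling rates (assumed values).
    nel = {
        "report_to": "network-errors",
        "max_age": max_age_s,
        "failure_fraction": 1.0,   # report every failed request
        "success_fraction": 0.01,  # sample 1% of successful requests
    }
    return {"Report-To": json.dumps(report_to), "NEL": json.dumps(nel)}


if __name__ == "__main__":
    for name, value in build_reporting_headers().items():
        print(f"{name}: {value}")
```

The NEL policy only names a group; where reports actually go is decided by the Reporting configuration, which reflects the split between the two specs described in the comment above.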
It'd be awesome to see Firefox support NEL, and by extension the Reporting API. We've added support at https://report-uri.com so hopefully that will allow site operators to enable this feature more easily without having to build their own reporting endpoint: https://scotthelme.co.uk/introducing-the-reporting-api-nel-other-major-changes-to-report-uri/ If adoption is a concern then perhaps this will give it a bump. |
Also worth noting there's an explainer. |
cc @ddragana @bzbarsky @martinthomson as well, for their thoughts. The idea seems reasonable to me -- obviously some of the value depends on how widely deployed it ends up being, as @mcmanus noted above. (And if it does become widely deployed, it's clearly advantageous for a browser to implement it because then their users are likely to get better experiences whenever any of the errors are specific to some browsers but not others.) My initial reaction is that it seems to fit within the |
There are a few things here that need more consideration, I think:
|
For (2), NEL doesn't expose any new errors to JavaScript. The spec calls out that NEL reports are not visible to ReportingObservers. Instead, NEL reports are only uploaded to the collectors defined by the owner of the recipient of a request. If the originator is different, they don't get to see any NEL reports about the success or failure of the outbound request. |
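The routing rule described in this comment can be sketched as follows. This is a deliberately simplified model, not the spec's algorithm: the policy cache, the origins, and the collector URL are all hypothetical, and real policies also involve expiry and `include_subdomains` handling. The point it illustrates is that the collector is looked up by the origin the request was sent to, never by the origin that initiated it.

```python
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical policy cache: recipient origin -> collector URL taken from
# the NEL/Report-To headers that origin previously served.
policy_cache = {
    "https://api.example.com": "https://collector.example.com/nel",
}


def origin_of(url: str) -> str:
    """Return the scheme://host[:port] origin of a URL."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


def collector_for(request_url: str, initiator_origin: str) -> Optional[str]:
    """Pick the collector for a report about a request (simplified sketch).

    initiator_origin is deliberately unused: the page that triggered the
    request has no say in, and no view of, where the report is delivered.
    """
    return policy_cache.get(origin_of(request_url))


# A page on shop.example.org fetches from api.example.com: the report goes
# to the collector that api.example.com configured.
print(collector_for("https://api.example.com/v1/data", "https://shop.example.org"))
# shop.example.org has no policy of its own here, so nothing is reported.
print(collector_for("https://shop.example.org/cart", "https://api.example.com"))
```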
To be clear, the concern is not that it exposes new errors to JavaScript, it's that it exposes new errors. I think our stance for this should be harmful. While I think we should be supportive of reporting things that are already otherwise exposed to improve developer ergonomics, using reports for information that is not otherwise known is a lot harder to justify. Additionally, while reporting in general is now per-document, NEL is not and still has the cache problem. (There's also the problem that none of the network errors are specified in terms of the low-level primitives defined in Fetch.) |
Exposed to whom? It's not just that the errors aren't exposed to JavaScript — they're not exposed to the originator of a request through any means. The errors are only exposed to the recipient of the request, who would see the same information in their server logs for successful requests, and even for failed requests that make it past a certain point in the connection establishment process.
I completely agree with this. We've tried to be very careful to not expose new information, and not expose anything to unauthorized parties. These are the principles we followed when designing NEL (from a paper we presented at NSDI back in February):
Is your concern that something like NEL would be harmful even if it followed these principles? Or that NEL as currently designed doesn't follow them? |
To be clear, I understand it's all "same origin". It's not clear to me how NEL follows those principles. E.g., how would example.com know I cannot get to their DNS records? How could example.com identify a specific user from errors in their server logs? How would they know an IP address is invalid? |
I'd like to lay this one to bed, but there seems to be a thicket of issues to resolve first. As far as @annevk's concern about exposing new information goes, I'd like to resolve that. Two things might help here:
The potential abuse as a supercookie seems to have been resolved with the reporting API, so it would be good to confirm that the same applies here. Of course, the spec hasn't tracked reporting API changes, so it is unclear. Those are the important items, based on the conversation. Understanding adoption status (by sites other than Google properties as noted) would be good. I also have a bunch of concerns about the specification itself. This hasn't tracked changes in the Reporting API and there has been no real activity on the spec in almost 2 years. So it seems like it might have been neglected a little. For example, the NEL header field is defined using defunct syntax (see RFC 8941) and it hasn't been registered in the appropriate place. @tantek was looking to resolve this as 'harmful', which I think is fair given the conversation so far and the current state of the specification. However, good answers to the above might change that disposition. |
login.microsoftonline.com uses NEL. |
Just to mention it, Cloudflare appears to be using NEL. |
Reporting from the Wikimedia Foundation, the non-profit that maintains Wikipedia and other related projects: we use NEL, and it has been really important for detecting outages that otherwise would have either been missed or only caught due to manual user reports. |
On 5th Jun 2022 there were 177,229 sites [1] serving a NEL header in the Top 1 Million Sites (list provided by Tranco [2]), which indicates that almost 18% of sites are using NEL. At Report URI [3], we process a little over 5,000,000 NEL reports per day, with none of them coming from Google owned properties or from Cloudflare managed properties. There are also other reporting platforms out there capable of ingesting NEL reports for websites for which I don't have any data to reference publicly. All in all, I think there's a reasonably large collection of sites out there that use NEL already and my data shows that the number of sites using it is steadily increasing. [1] https://crawler.ninja/files/nel-sites.txt |
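As a quick check of the "almost 18%" figure, using only the two numbers quoted in the comment above (the variable names are just for illustration):

```python
nel_sites = 177_229    # sites serving a NEL header, per the comment above
list_size = 1_000_000  # size of the Top 1 Million list

share = nel_sites / list_size * 100
print(f"{share:.1f}%")  # prints "17.7%"
```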
We are doing research on NEL. First of all, we have analyzed HTTP Archive data on NEL deployment. Deployment rose from 0 to 11.73% (almost 2,250,000 unique domains) since 2019. Current deployment is dominated by Cloudflare. This paper is not yet submitted and is a work in progress. Second of all, we have focused on data protection and security issues with Network Error Logging and have an accepted paper for SECRYPT'23. Our conclusions are:
We recommend:
Please read the paper for more details. Do not hesitate to contact us for more information. Edit: removed duplicate sentence fragment. |
This seems to indicate that some of the issues we saw earlier are still unresolved. It's unfortunate that there was apparently not enough spec work to address these concerns. @polcak Thank you for sharing this paper and your research with us. If we want to get this triaged, I suggest we label this |
Many of the previously concerning issues have been addressed in the spec. I expect that there will be further spec changes related to the discussions at TPAC as mentioned e.g. in 105. Assuming that privacy issues are sufficiently addressed this can be reconsidered. Some conditions would be:
|
Agreed. Happy for us to revisit, if our original concerns are going to be resolved. |
Following our (Cloudflare) discussions with Mozilla on the topic of client-side error reporting, we have compiled the following document. It aims to provide insights into the use-cases of NEL and privacy delta for each error report field consumed by Cloudflare (while keeping operational usability in mind): [SHARED] Cloudflare: NEL Usage Analysis. Important things to note:
We hope the above document will be useful to Mozilla in reevaluating the deployment of client-side connection error logging. The hope is that it will help in understanding data exposure to what is required for operational usability based on real-world deployment. I personally believe the takeaways align really well with @simon-friedberger's points above (to which the above document attempts to provide further guidance). |
Network error logging can provide useful feedback to servers regarding network failures across different browsers. While there are certain privacy issues that need to be handled with care, implementing a subset of this specification should be enough to prove useful both for server-side applications monitoring the logs and, ideally, for the users who get their networking issues addressed. We are tracking the implementation work in bug 1145235, though we still have privacy concerns that need to be addressed in the implementation and hopefully in a spec update. |
Request for Mozilla Position on an Emerging Web Specification