Crawl follows <link rel=alternate> links in <head>, surprisingly prerendering rss feed generating endpoints #5079

Closed
Bluebie opened this issue May 26, 2022 · 1 comment · Fixed by #6977
Labels
bug Something isn't working p2-nice-to-have SvelteKit cannot be used by a small number of people, quality of life improvements, etc.
Comments

@Bluebie
Bluebie commented May 26, 2022

Describe the problem

Currently, when crawl is enabled, the build will crawl seemingly any tag that has an href, src, or srcset attribute and does not have a rel="external" attribute. Normally, the pages these attributes point to are then prerendered:

if (href && !/\bexternal\b/i.test(rel)) {
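
To make the effect of that condition concrete, here is a minimal sketch that applies the same test to a few hand-written attribute sets. The helper name `shouldCrawl` and the sample links are hypothetical and exist only for illustration; this is not SvelteKit's actual crawler code.

```javascript
// Hypothetical helper mirroring the quoted condition; not SvelteKit's real crawler.
function shouldCrawl(attrs) {
  const href = attrs.href;
  const rel = attrs.rel ?? '';
  // Follow anything with an href unless rel contains the token "external".
  return Boolean(href) && !/\bexternal\b/i.test(rel);
}

// Sample tags, reduced to the attributes that matter for the check.
const links = [
  { tag: 'a', href: '/about' },
  { tag: 'link', rel: 'alternate', href: '/feed.atom' },
  { tag: 'link', rel: 'alternate external', href: '/feed.atom' },
  { tag: 'a', rel: 'external', href: 'https://example.com' }
];

const followed = links.filter(shouldCrawl).map((link) => link.href);
console.log(followed); // [ '/about', '/feed.atom' ]
```

Under this check, the plain rel="alternate" feed link is followed just like an ordinary page link, which is how the feed endpoint ends up prerendered.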

As a result, using an endpoint to generate dynamic data, such as an RSS, Atom, or JSON feed, causes the feed endpoint to be prerendered, so the feed served by the site is static and frozen. That might be desirable for static site rendering, but for dynamic sites it's a problem. It's also surprising to encounter, because endpoints do not seem to support the export const prerender = bool option (this may change with #4093). It appears the only ways to ensure endpoints aren't prerendered and frozen like this are to set rel="external", to disable prerendering/crawling entirely, or to avoid outputting any tags with href/src/srcset attributes pointing to endpoints in the HTML.

While it may be feasible to use <link rel="alternate external" href="/path/to/feed.atom">, it feels semantically wrong because the feed is not an external resource, and I worry about whether we can really trust feed readers to have such a nuanced understanding of rel attribute semantics.

Describe the proposed solution

I'm not sure what the right answer is, but here are some ideas:

  1. Don't crawl anything in <head>/<svelte:head>? Would this break any useful workflows? Do people dynamically generate stylesheets or social media Open Graph-type resources through <head>?
  2. Adjust the crawler to skip rel="alternate" specifically, to support the feeds use case? Would this break static site builders where a static prerendered feed may be desirable?
  3. Support export const prerender = false in endpoints, and have that setting somehow exclude the endpoint from the prerender output?
  4. Implement a second rel attribute value that functions the same as "external" for the SvelteKit crawler's purposes but has no other existing semantics on the web, so we can control the crawler at the link level without having undesirable effects on search engines.

Maybe a composite answer is good. Is it surprising that the crawler follows <link>s in head? I think so. Maybe crawling of <link> tags should be opt-in? <link rel="stylesheet svelte-prerender" href="/styles-compiler-endpoint.css"> maybe? Maybe it should be opt-in for dynamic site builds but on by default with fully static adapters? But that feels like a messy answer.
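
For illustration, the opt-in idea could be sketched roughly as below. This is purely hypothetical: the `svelte-prerender` token is the one floated in this issue and is not something SvelteKit recognizes, and `relTokens` and `shouldCrawlOptIn` are made-up names.

```javascript
// Split a rel attribute into lowercase tokens, tolerating a missing attribute.
function relTokens(rel) {
  return (rel ?? '').toLowerCase().split(/\s+/).filter(Boolean);
}

// Hypothetical policy: <a> tags are crawled unless marked external,
// while <link> tags are crawled only when they opt in explicitly.
function shouldCrawlOptIn(tag, attrs) {
  if (!attrs.href) return false;
  const tokens = relTokens(attrs.rel);
  if (tokens.includes('external')) return false;
  if (tag === 'link') return tokens.includes('svelte-prerender');
  return true;
}

// A feed link is left alone; an opted-in stylesheet endpoint is crawled.
console.log(shouldCrawlOptIn('link', { rel: 'alternate', href: '/feed.atom' }));
console.log(shouldCrawlOptIn('link', { rel: 'stylesheet svelte-prerender', href: '/styles-compiler-endpoint.css' }));
```

The trade-off is the one raised above: static sites that want a prerendered feed would have to add the token to every such link.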

Alternatives considered

No response

Importance

would make my life easier

Additional Information

No response

@Rich-Harris Rich-Harris added this to the 1.0 milestone Jul 15, 2022
@benmccann added the p2-nice-to-have, breaking change, and bug labels and removed the breaking change label Jul 19, 2022
@benmccann
Member

I've never liked that you have to use external for something you don't want to be a client-side navigation, even if it's not actually external to your domain. It could be nice to introduce a new attribute for this.

Marking endpoints as prerenderable is something we're considering: #4093

I also wonder about your question of skipping link rel="stylesheet" or perhaps any links in the _app directory. Neither of these seems like it would solve your problem, though they might be nice optimizations. We could go further and skip all link tags, but that seems like it would have the potential to break something, so I don't know that I'd go that far.

dummdidumm pushed a commit that referenced this issue Sep 23, 2022
…6977)

Fixes #5079. Most of the pieces were in place, but currently SvelteKit errors if you have a <link> in a prerendered page that points to a non-prerenderable +server.js file.

With this PR, if the crawler hits those routes, it won't error, but it won't prerender the route either.
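
The behaviour described in that commit message can be pictured with a small simulation. Everything here is an assumption made for illustration (the route table, the `crawl` function, and the `written`/`skipped` result shape); it does not reflect how SvelteKit's prerenderer is actually implemented.

```javascript
// Hypothetical route table: path -> whether the route allows prerendering.
const routes = new Map([
  ['/', true],
  ['/feed.atom', false] // e.g. a +server.js route that is not prerenderable
]);

// Visit each discovered path. Non-prerenderable routes are recorded as
// skipped instead of raising an error, and nothing is written for them.
function crawl(paths) {
  const written = [];
  const skipped = [];
  for (const path of paths) {
    if (routes.get(path) === true) {
      written.push(path);
    } else {
      skipped.push(path);
    }
  }
  return { written, skipped };
}

const result = crawl(['/', '/feed.atom']);
console.log(result.written); // [ '/' ]
console.log(result.skipped); // [ '/feed.atom' ]
```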