Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Breadcrumbs SEO microdata includes invalid values for IDs if no href #7241

Closed
6 of 7 tasks
fnune opened this issue Apr 25, 2022 · 17 comments
Closed
6 of 7 tasks

Breadcrumbs SEO microdata includes invalid values for IDs if no href #7241

fnune opened this issue Apr 25, 2022 · 17 comments
Labels
bug An error in the Docusaurus core causing instability or issues with its execution closed: working as intended This issue is intended behavior, there's no need to take any action.

Comments

@fnune
Copy link

fnune commented Apr 25, 2022

Have you read the Contributing Guidelines on issues?

Prerequisites

  • I'm using the latest version of Docusaurus.
  • I have tried the npm run clear or yarn clear command.
  • I have tried rm -rf node_modules yarn.lock package-lock.json and re-installing packages.
  • I have tried creating a repro with https://new.docusaurus.io.
  • I have read the console error message carefully (if applicable).

Description

If a breadcrumb item (such as a category item in the sidebar) has no link (and thus no href) the generated SEO microdata is invalid. Since the generator is designed to pick an href and use it as the id, it looks like if there's no href to take it default to whatever text is visible inside the element.

So, if you have, e.g. a sidebar with nested items:

image

Then the generated breadcrumbs have no links associated to them. The generator takes, in this case "MCU Guides" as its ID, which is apparently invalid:

image

In these cases, I think it would be appropriate to not add an ID property at all.

Reproducible demo

No response

Steps to reproduce

  1. Add a nested sidebar to Docusaurus:
// E.g. in sidebar.js
module.exports = {
    docs: {
        "Some Category": {
            type: "category",
            label: "Some Label",
            items: [
                "some-category/actual-docs"
            ]
        }
    }
};
  1. Run the Google rich results validator on the resulting page: https://search.google.com/test/rich-results

Confirm that the ID of the SEO microdata is Some Category, which is invalid.

Expected behavior

No ID gets added in the generated SEO microdata for breadcrumbs without an href.

Actual behavior

An invalid ID gets added, using the text value from the category breadcrumb, because there's no href to take from.

Your environment

Self-service

  • I'd be willing to fix this bug myself.
@fnune fnune added bug An error in the Docusaurus core causing instability or issues with its execution status: needs triage This issue has not been triaged by maintainers labels Apr 25, 2022
@Josh-Cena
Copy link
Collaborator

Hi, we're aware of this issue. What do you propose to solve this (because we didn't add id but Google insists to infer it)? I'm pretty out of ideas, and I'm tempted to say this is wontfix, that if you expect the SEO metadata to be fully valid, you'd better add an href (which is what you are supposed to do for breadcrumbs anyways)

@Josh-Cena Josh-Cena added status: needs more information There is not enough information to take action on the issue. and removed status: needs triage This issue has not been triaged by maintainers labels Apr 25, 2022
@fnune
Copy link
Author

fnune commented Apr 25, 2022

Ultimately I believe there's a mismatch between Docusaurus' model of breadcrumbs and Google's expectation. It seems like in the eyes of Google a breadcrumb should only exist if it links to something*. However this isn't (please correct me if I'm wrong) the Docusaurus model: for Docusaurus, a category linking to nothing is a valid breadcrumb.

I have (AFAIK) no way to control Docusaurus' breadcrumb generation, so I can't really add an href to my category breadcrumbs. This leaves me with only one practical solution: to remove the breadcrumbs altogether (which I think I can do by configuring docusaurus-plugin-content-docs).

* My personal view: a breadcrumb without a link isn't really a breadcrumb and shouldn't be added to the page. Each breadcrumb is meant to represent a page the user has visited to get to their current page. The user never visited a category page, so there should be no item for that. This would also solve the microdata issue.

@fnune
Copy link
Author

fnune commented Apr 25, 2022

Alternatively I could add an actual page for each category, that just links to items in the category. But the issue still stands that the breadcrumbs generated by Docusaurus don't represent clickable pages in all cases (and IMO they should).

@fnune
Copy link
Author

fnune commented Apr 25, 2022

@Josh-Cena
Copy link
Collaborator

Josh-Cena commented Apr 25, 2022

I have (AFAIK) no way to control Docusaurus' breadcrumb generation,

@fnune You can. The breadcrumbs are generated from the sidebar structure, so as long as each sidebar category has an index page, the breadcrumb will also have an id. Would you be happy enough if we simply add a notice that if you care about SEO, you should add an index page for each category?

Note that the Google console message is only a warning, and I believe SEO won't be undermined by this.

Again, I'm fully aware (all the way back when I sent #6697, I even noted in the PR description), I do believe this is terrible, but I don't know of a way to fix this. If you do, please kindly inform me. Otherwise, this issue is basically unactionable.

Update: this might be possible if we allow custom breadcrumb target links. See #6953

@fnune
Copy link
Author

fnune commented Apr 25, 2022

I think the warning "Add an index page for each category" sounds like a good idea. It's not only a potential SEO problem, it's also a UX issue because the breadcrumb that's generated isn't clickable.

I will follow your suggestion of adding index pages to category entries.

On another note, do you think I should file a different issue for this?

My personal view: a breadcrumb without a link isn't really a breadcrumb and shouldn't be added to the page. Each breadcrumb is meant to represent a page the user has visited to get to their current page. The user never visited a category page, so there should be no item for that. This would also solve the microdata issue.

E.g. "Generate no breadcrumb for category sidebar entries without a page".

@Josh-Cena
Copy link
Collaborator

That sounds like a viable solution as well. Users should be able to efficiently filter out all breadcrumbs without links because the UX could be potentially confusing.

Again, we could either (a) try to figure out a better heuristic so that all items have href or (b) ask users to explicitly provide an href if there's no category link, once we have #6953

@fnune
Copy link
Author

fnune commented Apr 25, 2022

Defaulting to link: { type: 'generated-index' } sounds reasonable but it could be considered a breaking change. Users will need to add good-looking descriptions to all their pages so that the generated indexes look good.

@fnune
Copy link
Author

fnune commented Apr 25, 2022

Defaulting to link: { type: 'generated-index' } sounds reasonable but it could be considered a breaking change. Users will need to add good-looking descriptions to all their pages so that the generated indexes look good.

A problem with this would be that the shorthand syntax encourages sidebar items that are not really links and potentially don't need to be. E.g. https://docusaurus.io/docs/sidebar/items#using-shorthands

So I think:

(a) try to figure out a better heuristic so that all items have href or (b) ask users to explicitly provide an href if there's no category link, once we have #6953

(a) is difficult (can't always generate a proper link), but (b) might be OK as long as there's e.g. a build error?

@Josh-Cena
Copy link
Collaborator

Josh-Cena commented Apr 25, 2022

No, not defaulting category indexes, because that doesn't make much sense. But defaulting breadcrumb links is something we can do (i.e. even if there's no category link, we can find a link, like the first doc). In fact, we already have such algorithm: if JS is disabled, you can still navigate through the categories, because each category will always have a link that's clickable, just that it may be removed after React hydration.

@fnune
Copy link
Author

fnune commented Apr 25, 2022

No, not defaulting category indexes, because that doesn't make much sense.

You're right, thanks for the nudge. That pushed me to amend my own PR on my docs site.

But defaulting breadcrumb links is something we can do (i.e. even if there's no category link, we can find a link, like the first doc).

Sounds reasonable. That would also be a solution to this issue.

@fnune
Copy link
Author

fnune commented Apr 26, 2022

Quick update:

After applying the suggested fixes, now Google still gets a bad ID from only the last item in the breadcrumbs. See the breadcrumbs in https://docs.memfault.com/docs/mcu/arm-cortex-m-guide/

And the attached rich results test... results here https://search.google.com/test/rich-results/result?id=hIFxmU6Zz2gIigWiHig8_w

image

image

HTML for that last breadcrumb item:

<li
  itemscope=""
  itemprop="itemListElement"
  itemtype="https://schema.org/ListItem"
  class="breadcrumbs__item breadcrumbs__item--active"
>
  <span class="breadcrumbs__link" itemprop="item name">ARM Cortex-M</span>
  <meta itemprop="position" content="3">
</li>

@Josh-Cena
Copy link
Collaborator

This is fixed by #7179 already.

@fnune
Copy link
Author

fnune commented Apr 26, 2022

Sweet, thanks!

Leaving it up to you whether to keep this issue around or close it. I believe it still represents a problem of bad defaults which leads to the generation of bad SEO microdata.

If you'd like me to I can also submit a different issue for that, and then we can close this one.

@Josh-Cena
Copy link
Collaborator

I don't know what's a good call. The only ways we can meaningfully move this forward are:

(a) Find a way to tell Google to not generate id for items without href. The BreadcrumbList schema, I've checked several times, does not mandate each item to have an id, so I doubt if it's really necessary (Google doesn't issue an error anyways—only a warning)
(b) Find a good heuristic to ensure every breadcrumb item has an href. It's slightly more opinionated and I'm skeptical about that.
(c) Allow you to customize breadcrumb link, i.e. #6953

Option (c) seems the most viable, but I'm also desperate trying to find an answer to (a), even just for curiosity's sake😅

But anyways, I'm going to go ahead and close this since if we agree that (c) is the best choice (and for the time being, you as user need to ensure that every category has a link).

@Josh-Cena Josh-Cena added closed: working as intended This issue is intended behavior, there's no need to take any action. and removed status: needs more information There is not enough information to take action on the issue. labels Apr 26, 2022
@slorber
Copy link
Collaborator

slorber commented Apr 27, 2022

Some thoughts:

  • Note sure how to fix the google-related id problem either

  • Having a plugin option to enable the ability to default to link: {type: 'generated-index'} makes sense to me. In this case users would be able to use category shorthand, and would opt-out with link: false | null

  • The heuristic we use when JS is disabled can be a good-enough workaround that should work in most cases (href is first link found in the category, including its children items)

  • If breadcrumb does not have link, we could remove the microdata?

  • In all cases you can swizzle --eject DocBreadcrumbs and customize the existing logic

  • Breadcrumb customization options? Please comment on Add Breadcrumb customization options #6953 about this use-case and what kind of API you think about to solve the current problem.

@Josh-Cena
Copy link
Collaborator

If breadcrumb does not have link, we could remove the microdata?

I thought about that at the time I was implementing it, but I don't like the idea. A breadcrumb item should be allowed without a link (at least the schema allows it, not to say it's any good for SEO anyways)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug An error in the Docusaurus core causing instability or issues with its execution closed: working as intended This issue is intended behavior, there's no need to take any action.
Projects
None yet
Development

No branches or pull requests

3 participants