
robots.txt Plugin #570

Merged
merged 4 commits into main on Feb 14, 2024
Conversation

kwaa
Member

@kwaa kwaa commented Feb 13, 2024

Description

I didn't add a comment field to the Rule type for now, due to the complexity.

Related Issues

Closes #543

Check List

  • Have you read the
    CODE OF CONDUCT?
  • Have you read the
    CONTRIBUTING
    document?
    • One pull request per feature. If you want to do more than one thing,
      send multiple pull requests.
    • Write tests.
    • Run deno fmt to fix the code format before committing.
    • Document any change in the CHANGELOG.md.

@kwaa kwaa marked this pull request as draft February 13, 2024 15:14
@oscarotero
Member

The plugin looks great, thank you!
I just have a couple of questions:

  • Maybe the plugin name should be robots_txt, instead of simply robots, because I think it's clearer about what the plugin does. What do you think?
  • I like the simplicity of the configuration, with an array of allow and disallow. If I only want to allow Google and Bing, the configuration is:
    site.use(robots({
      allow: ["Googlebot", "Bingbot"],
    }));
    But this doesn't prevent other bots from scanning the site, because I'm not explicitly saying that they may not. I don't know if this is the expected behavior. I guess the proper way to do it is:
    site.use(robots({
     allow: ["Googlebot", "Bingbot"],
     disallow: ["*"],
    }));
    Is that correct?

@kwaa
Member Author

kwaa commented Feb 13, 2024

  • Maybe the plugin name should be robots_txt, instead of simply robots, because I think it's clearer about what the plugin does. What do you think?

I don't think there is any problem with the file name, but the variable name will be robotsTxt(), which feels a bit ugly.

  • I like the simplicity of the configuration, with an array of allow and disallow. If I only want to allow Google and Bing, the configuration is:
    site.use(robots({
      allow: ["Googlebot", "Bingbot"],
    }));
    But this doesn't prevent other bots from scanning the site, because I'm not explicitly saying that they may not. I don't know if this is the expected behavior. I guess the proper way to do it is:
    site.use(robots({
     allow: ["Googlebot", "Bingbot"],
     disallow: ["*"],
    }));
    Is that correct?

Yes.
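The behavior confirmed above can be sketched in plain TypeScript. This is not the actual plugin source, just an illustration of how an allow/disallow configuration could map to robots.txt output: each user agent in allow gets an `Allow: /` record, each in disallow gets a `Disallow: /` record, so allowing Googlebot and Bingbot only restricts other bots when `disallow: ["*"]` is also set.

```typescript
// Hypothetical sketch of the allow/disallow semantics discussed above.
// The option names mirror the configuration shown in this thread; the
// buildRobotsTxt helper itself is an assumption, not the plugin's API.
interface RobotsOptions {
  allow?: string[];
  disallow?: string[];
}

function buildRobotsTxt({ allow = [], disallow = [] }: RobotsOptions): string {
  const lines: string[] = [];
  // One record per explicitly allowed agent.
  for (const agent of allow) {
    lines.push(`User-agent: ${agent}`, "Allow: /", "");
  }
  // One record per explicitly disallowed agent; "*" covers everyone else.
  for (const agent of disallow) {
    lines.push(`User-agent: ${agent}`, "Disallow: /", "");
  }
  return lines.join("\n").trimEnd() + "\n";
}
```

With `{ allow: ["Googlebot", "Bingbot"], disallow: ["*"] }`, this yields Allow records for Googlebot and Bingbot followed by a `User-agent: *` / `Disallow: /` record, which is exactly the "allow only these two bots" setup from the question.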

@oscarotero oscarotero marked this pull request as ready for review February 14, 2024 19:54
@oscarotero oscarotero merged commit 41d1aed into main Feb 14, 2024
6 checks passed
@oscarotero oscarotero deleted the plugin/robots branch February 14, 2024 19:54
@oscarotero
Member

Thank you!

Successfully merging this pull request may close these issues.

robots.txt plugin