Allow user agents to be customized in robots.txt #2109
Comments
A related topic was recently discussed in #2067, and while I would prefer not to expect people to customize robots.txt by providing a file, I agree a certain level of customization should be possible. I mentioned some of the problems and history of the current implementation in #2067 (reply in thread), and I have already put together and merged a feature to allow all short URLs to be crawled by default, if desired (#2107), which achieves the same result you mentioned above, but for any crawler, not just Facebook's. On top of that, the only missing piece would be letting you provide a list of user agents you want to allow, falling back to the current default when none is provided. That said, you can already make your short URLs crawlable, with the limitation that it needs to be done one by one, hence the PR above.
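For reference, with the #2107 option enabled, the served robots.txt would amount to something like the following (a sketch, not Shlink's literal output), and the missing piece discussed here would narrow `User-agent: *` down to a user-provided list of crawlers:

```
# All short URLs crawlable, for every crawler
User-agent: *
Allow: /
```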
Thanks! I'll take a look at #2107!
I'm going to repurpose this issue to specifically allow user agents to be customized in robots.txt. That, plus the already existing capabilities around robots.txt, should cover most use cases in a more predictable and reproducible way. Later on, if some capability is still missing, I'm open to discussing more improvements and features.
That's cool, @acelaya!! Thank you!!
This feature is now implemented and will be part of Shlink 4.2.
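Assuming the option follows Shlink's usual env-var based configuration, enabling it from docker-compose could look roughly like this. The `ROBOTS_USER_AGENTS` name and value format are assumptions on my part, so verify them against the Shlink 4.2 docs (`ROBOTS_ALLOW_ALL_SHORT_URLS` is the option merged in #2107):

```yaml
services:
  shlink:
    image: shlinkio/shlink:stable
    environment:
      # Make every short URL crawlable (feature merged in #2107)
      ROBOTS_ALLOW_ALL_SHORT_URLS: 'true'
      # Comma-separated crawlers to allow instead of "User-agent: *"
      # (name and format assumed; check the Shlink 4.2 docs)
      ROBOTS_USER_AGENTS: 'facebookexternalhit,Twitterbot'
```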
Summary
The ability to read a text file containing robots.txt customizations, so that those customizations can be backed up and persisted outside the Docker container.
Use case
I've been editing the module/Core/src/Action/RobotsAction.php file inside the container because I (and possibly many other people with similar needs) would like to allow Facebook's bot[1], so that Shlink links show an article preview when I paste them. But this broke when I switched to stable-roadrunner (great image btw!) because -- obviously -- I forgot to re-apply my robots.txt customization.
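For reference, the customization in question essentially boils down to serving a robots.txt like the following (facebookexternalhit is the user agent Facebook's link-preview crawler identifies with; the surrounding deny-by-default rules are a sketch, not Shlink's literal defaults):

```
# Let Facebook's preview crawler fetch pages, keep everything else out
User-agent: facebookexternalhit
Allow: /

User-agent: *
Disallow: /
```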
Since this feature would be pretty straightforward (I already know which file outputs the robots.txt content), I was thinking of adding it myself. However, I'm not sure this -- externalizing part of robots.txt so users can persist it outside the container -- is a good idea, so I'd like to validate it with you before adding the feature.
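To illustrate the idea, here is a minimal sketch of what the externalization could look like: serve a user-provided file when one exists, and fall back to the built-in rules otherwise. The path, constant, and function names are hypothetical, not Shlink's actual implementation:

```php
<?php

declare(strict_types=1);

// Hypothetical override location; a user would bind-mount this file into
// the container so it survives image upgrades.
const ROBOTS_OVERRIDE_FILE = '/etc/shlink/robots.txt';

function buildRobotsTxt(): string
{
    // Prefer the user-provided file when it exists and is readable
    if (is_readable(ROBOTS_OVERRIDE_FILE)) {
        $custom = file_get_contents(ROBOTS_OVERRIDE_FILE);
        if ($custom !== false) {
            return $custom;
        }
    }

    // Fallback: built-in deny-by-default rules (sketched here)
    return "User-agent: *\nDisallow: /\n";
}

header('Content-Type: text/plain');
echo buildRobotsTxt();
```

With something like that in place, persisting the customization would be a simple volume mount, e.g. `-v ./robots.txt:/etc/shlink/robots.txt:ro`.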
Thanks for the great work folks btw!
[1] Allowing Facebook's user-agent in robots.txt