chore: Add script to identify broken links #186

Merged: 4 commits, Sep 22, 2022

Conversation

ckadner
Member

@ckadner ckadner commented Sep 15, 2022

Motivation

Broken links in the documentation make it harder for newcomers to learn about KServe, and it is often not easy to find out what the correct link should be. For project maintainers, it is time-consuming to revisit the documentation and identify broken links manually. It certainly does not make the best impression on newcomers if they click on links that return 404 errors.

Proposed Changes

Add a new script, docs/hack/verify-doc-links.py, which identified 93 broken links among the current 1291 links (581 unique URLs) in 153 Markdown files.

I made a "best-effort" attempt to fix as many of the broken links identified by the new script as possible. 11 broken links remain. I might need some help from the KServe community to address the remaining ones.

Example

$ ./docs/hack/verify-doc-links.py
Checked 1292 links (572 unique URLs) in 153 Markdown files.

docs/modelserving/v1beta1/lightgbm/README.md:33: https://github.com/kserve/kserve/python/lgbserver -> 404
docs/modelserving/v1beta1/serving_runtime.md:34: https://github.com/kserve/kserve/tree/master/python/lightgbm -> 404
docs/sdk_docs/docs/V1alpha1ClusterServingRuntime.md:11: /docs/sdk_docs/docs/.md -> 404
docs/sdk_docs/docs/V1alpha1ServingRuntime.md:11: /docs/sdk_docs/docs/.md -> 404
docs/sdk_docs/docs/V1beta1CustomExplainer.md:24: /docs/sdk_docs/docs/ResourceQuantity.md -> 404
docs/sdk_docs/docs/V1beta1CustomPredictor.md:24: /docs/sdk_docs/docs/ResourceQuantity.md -> 404
docs/sdk_docs/docs/V1beta1CustomTransformer.md:24: /docs/sdk_docs/docs/ResourceQuantity.md -> 404
docs/sdk_docs/docs/V1beta1ExplainerSpec.md:33: /docs/sdk_docs/docs/ResourceQuantity.md -> 404
docs/sdk_docs/docs/V1beta1PodSpec.md:24: /docs/sdk_docs/docs/ResourceQuantity.md -> 404
docs/sdk_docs/docs/V1beta1PredictorSpec.md:33: /docs/sdk_docs/docs/ResourceQuantity.md -> 404
docs/sdk_docs/docs/V1beta1TransformerSpec.md:30: /docs/sdk_docs/docs/ResourceQuantity.md -> 404

ERROR: Found 11 invalid Markdown links

How the script works

  • Find all Markdown files in the project
  • Extract all Markdown-style links and plain URLs
  • Create unique list of simplified URLs
  • Use local files to check relative links, use cached HTTP requests for links outside project
  • Run HTTP requests in parallel
  • Retry GitHub rate-limited URLs after (at least) one minute
  • Report broken links with source file and line number
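
The first steps above (finding Markdown files, extracting links, simplifying them, and checking relative links against local files) could be sketched roughly as follows. This is an illustration only, not the code in docs/hack/verify-doc-links.py: the regular expressions, the function names, and the reading of "simplified" as "with the #fragment removed" are all assumptions.

```python
import re
from pathlib import Path

# Hypothetical patterns; the real script may match links differently.
MD_LINK = re.compile(r"\[[^\]]*\]\(([^)\s]+)[^)]*\)")
PLAIN_URL = re.compile(r"(?<![(<\[])\bhttps?://[^\s)>\]]+")


def extract_links(md_file):
    """Yield (line_number, url) for Markdown-style links and plain URLs."""
    text = Path(md_file).read_text(encoding="utf-8")
    for line_no, line in enumerate(text.splitlines(), start=1):
        for match in MD_LINK.finditer(line):
            yield line_no, match.group(1)
        for match in PLAIN_URL.finditer(line):
            yield line_no, match.group(0)


def simplify(url):
    """Drop the #fragment (assumed meaning of 'simplified URL')."""
    return url.split("#", 1)[0]


def check_local(md_file, path, root):
    """Return 200 if a relative link points at an existing path, else 404."""
    # Links starting with "/" are resolved against the project root,
    # all others against the directory of the file that contains them.
    base = Path(root) if path.startswith("/") else Path(md_file).parent
    return 200 if (base / path.lstrip("/")).resolve().exists() else 404


def find_broken_local_links(root):
    """Return 'file:line: url -> status' entries for broken relative links."""
    report = []
    for md_file in sorted(Path(root).rglob("*.md")):
        for line_no, url in extract_links(md_file):
            path = simplify(url)
            if not path or path.startswith(("http://", "https://", "mailto:")):
                continue  # external links need an HTTP check instead
            status = check_local(md_file, path, root)
            if status != 200:
                rel = md_file.relative_to(root).as_posix()
                report.append(f"{rel}:{line_no}: {url} -> {status}")
    return report
```

Calling find_broken_local_links(".") from the project root would return entries in the same file:line: url -> status shape as the example output above, limited to relative links.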

Script pedigree: I contributed several iterations of this script to the MLX and KFP-Tekton projects, and most recently to ModelMesh. This latest one differs from earlier versions in that it does not require any non-standard libraries, such as markdown or requests, to be installed into a virtual environment before running the script.
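
To illustrate what checking external links with only the Python standard library can look like, here is a minimal sketch using urllib and concurrent.futures. It is not the script's actual code: the function names, the worker count, the use of HEAD requests, and treating HTTP 429 as the rate-limit signal are assumptions made for this example.

```python
import time
import urllib.error
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or 0 if no response was received."""
    request = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": "link-check-sketch"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 404 or 429
    except OSError:
        return 0  # DNS failure, refused connection, timeout, ...


def check_urls(urls, fetch=fetch_status, workers=8, retry_wait=60, sleep=time.sleep):
    """Check each unique URL once, in parallel, and retry rate-limited ones."""
    unique = sorted(set(urls))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # The dict doubles as a cache: one request per unique URL.
        statuses = dict(zip(unique, pool.map(fetch, unique)))
    # Assumption: a rate-limited request is reported as HTTP 429.
    limited = [url for url, status in statuses.items() if status == 429]
    if limited:
        sleep(retry_wait)
        for url in limited:
            statuses[url] = fetch(url)
    return statuses
```

The fetch and sleep parameters are injectable so the logic can be exercised without network access or a real one-minute wait.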

Possible future enhancements

Add a new PR check that runs the docs/hack/verify-doc-links.py script on every PR, so that broken links are identified as soon as possible and, ideally, fixed by the PR contributor even when they fall outside the scope of the PR.


/cc @njhill


netlify bot commented Sep 15, 2022

Deploy Preview for elastic-nobel-0aef7a ready!

🔨 Latest commit: 9dabd17
🔍 Latest deploy log: https://app.netlify.com/sites/elastic-nobel-0aef7a/deploys/6323f35503c0be000924f825
😎 Deploy Preview: https://deploy-preview-186--elastic-nobel-0aef7a.netlify.app

Member Author

ckadner commented Sep 19, 2022

/cc @yuzisun

Member

yuzisun commented Sep 22, 2022

Thanks @ckadner !!

/lgtm
/approve

@kserve-oss-bot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ckadner, yuzisun

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kserve-oss-bot kserve-oss-bot merged commit f7aa924 into kserve:main Sep 22, 2022
rachitchauhan43 pushed a commit to rachitchauhan43/website that referenced this pull request Nov 21, 2022
* chore: Add script to verify Markdown links

* Find all Markdown files in the project
* Extract all Markdown-style links and plain URLs
* Create unique list of simplified URLs
* Use local files to check relative links, use cached
  HTTP requests for links outside project
* Run HTTP requests in parallel
* Retry GitHub rate-limited URLs after one minute
* Report broken links with source file and line number

Signed-off-by: Christian Kadner <[email protected]>

* Fix broken links

Signed-off-by: Christian Kadner <[email protected]>

* Refine URL exclusion patterns

Signed-off-by: Christian Kadner <[email protected]>

* Fix more broken links

Signed-off-by: Christian Kadner <[email protected]>

Signed-off-by: Christian Kadner <[email protected]>
Signed-off-by: rachitchauhan43 <[email protected]>