[DOC] update dead link in UDF doc #10322

Closed
nvliyuan opened this issue Jan 30, 2024 · 5 comments · Fixed by #10309
Labels
documentation Improvements or additions to documentation

Comments

@nvliyuan
Collaborator

We have already migrated the getting-started docs to new pages, so we need to fix all dead links in the gh-pages branch and in the original docs. We also need to find out why the markdown link checker did not catch them.

For example, the UDF doc contains a dead link to a Frequently Asked Questions entry.

nvliyuan added the "documentation" and "? - Needs Triage" labels Jan 30, 2024
nvliyuan self-assigned this Jan 30, 2024
@nvliyuan
Collaborator Author

nvliyuan commented Jan 30, 2024

The markdown link checker did not find it because we just checked the modified files.
https://github.com/NVIDIA/spark-rapids/blob/branch-24.02/.github/workflows/markdown-links-check.yml#L33
There are too many links in the repo; checking all of them would take 5-10 minutes for each commit.
For this PR I will check every file so that all dead links get fixed, and we should run a full check whenever a PR makes large doc changes.
CC @viadea
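
To illustrate what a full check involves, here is a minimal Python sketch that walks every Markdown file under a directory and reports relative links whose target file does not exist. It is not the markdown link checker used by the workflow above; the function name `find_dead_relative_links` and the regular expression are assumptions made for illustration, and external URLs are skipped rather than fetched.

```python
import re
from pathlib import Path

# Matches inline Markdown links such as [text](target). This is a
# simplification: reference-style links and code blocks are not handled.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")

# Matches targets that start with a URL scheme (https:, mailto:, ...).
SCHEME_RE = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*:")


def find_dead_relative_links(root):
    """Return (file, target) pairs, in file order, whose relative target is missing."""
    root = Path(root)
    dead = []
    for md_file in sorted(root.rglob("*.md")):
        text = md_file.read_text(encoding="utf-8")
        for target in LINK_RE.findall(text):
            # Skip external URLs and in-page anchors; only local files are checked.
            if SCHEME_RE.match(target) or target.startswith("#"):
                continue
            # Drop a trailing "#anchor" before resolving against the linking file.
            path_part = target.split("#", 1)[0]
            if not (md_file.parent / path_part).exists():
                dead.append((md_file.relative_to(root).as_posix(), target))
    return dead
```

Calling `find_dead_relative_links(".")` from a repository root would scan every `.md` file, which is the "check all" behaviour discussed here; restricting the loop to a list of changed files would reproduce the modified-files-only behaviour.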

@jlowe
Contributor

jlowe commented Jan 30, 2024

> we just checked the modified files

This seems like a flawed approach. It will detect bad links added to the modified files, but it cannot catch the case where a link target is moved or removed while the file that links to it is left unmodified (and should have been updated).

How long does it actually take to run the full scan? Our CI build already takes 2+ hours, so I'd be shocked if link checking was the limiting factor for PR workflows. If it's still deemed too expensive to run the full check per PR, we would want to run this regularly (nightly?) to check for bad links.
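
The gap described in this comment can be shown with a small Python sketch. It models the docs as a dictionary from file name to the files it links to; the file names (`udf.md`, `FAQ.md`, `faq/index.md`) and the helper `dead_links` are hypothetical and chosen only for illustration.

```python
def dead_links(docs, files_to_check):
    """Return (source, target) pairs in files_to_check whose target is missing.

    docs maps each file name to the list of file names it links to.
    """
    return [
        (source, target)
        for source in sorted(files_to_check)
        for target in docs.get(source, [])
        if target not in docs
    ]


# Before the PR: udf.md links to FAQ.md, and both files exist.
before = {"udf.md": ["FAQ.md"], "FAQ.md": []}

# The PR moves FAQ.md to faq/index.md but never touches udf.md.
after = {"udf.md": ["FAQ.md"], "faq/index.md": []}
modified = {"faq/index.md"}

print(dead_links(after, modified))      # modified-files-only check: []
print(dead_links(after, after.keys()))  # full check: [('udf.md', 'FAQ.md')]
```

The modified-files-only check passes because the only changed file contains no links, while the full check reports the stale link in the untouched file.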

@viadea
Collaborator

viadea commented Jan 30, 2024

I think we need to run the full check either nightly or in CI. There is a high possibility that the existing docs have broken links.

mattahrens removed the "? - Needs Triage" label Jan 30, 2024
@nvliyuan
Collaborator Author

Checking all links takes 5-10 minutes per commit. If CI already takes two hours, checking all links will have very little impact, so I will enable the full check.
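
As a rough worked example of that estimate, the Python sketch below computes the extra time as a fraction of the CI run. It assumes the "2+ hours" figure is exactly 120 minutes and that the link check runs serially with the rest of CI; if the check runs in a separate parallel job, the added wall-clock time could be lower. The function name `added_overhead` is hypothetical.

```python
def added_overhead(ci_minutes, check_minutes):
    """Fraction of extra wall-clock time if the check runs serially with CI."""
    return check_minutes / ci_minutes


CI_MINUTES = 120  # assumption: "2+ hours" taken as exactly 2 hours

for check in (5, 10):
    # Prints 4.2% for a 5-minute check and 8.3% for a 10-minute check.
    print(f"{check} min check: {added_overhead(CI_MINUTES, check):.1%} extra")
```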

GaryShen2008 pushed a commit that referenced this issue Feb 6, 2024
* add custom 404 page

Signed-off-by: liyuan <[email protected]>

* Update .github/404.html

Co-authored-by: Gera Shegalov <[email protected]>

* test markdownlink checker

Signed-off-by: liyuan <[email protected]>

* check all links

Signed-off-by: liyuan <[email protected]>

* fix all dead links

Signed-off-by: liyuan <[email protected]>

* revert markdown link checker config files after fix all dead links

Signed-off-by: liyuan <[email protected]>

* enable all links checking as discussed in #10322

Signed-off-by: liyuan <[email protected]>

* we already support zstd orc and parquet write, fix #10365

Signed-off-by: liyuan <[email protected]>

---------

Signed-off-by: liyuan <[email protected]>
Co-authored-by: Gera Shegalov <[email protected]>