Feature: improve packaging #688

Open
laurentsimon opened this issue Jul 13, 2021 · 14 comments
@laurentsimon
Contributor

laurentsimon commented Jul 13, 2021

Improvements:

  1. The Packaging check only looks for GitHub packaging workflows, but that is not the only way to publish code. We should also check whether the package is present in the language's package registry.
    Example:
    for npm: package.json has a "repository" field, and metadata may be available from the npm API. Alternatively, we could read the "name" field from the repository's package.json, then check npm to see whether that package exists.

  2. The check currently uses regexes; we should switch to proper parsing.

  3. We're missing some of the registries listed at https://docs.github.com/en/packages/working-with-a-github-packages-registry/working-with-the-rubygems-registry

  4. We're missing Go packages; see https://github.com/ossf/scorecard/blob/main/.github/workflows/goreleaser.yaml

  5. We're missing GitHub Marketplace actions.

  6. Update the Token-Permission workflow as well, since it also checks whether the packages permission is needed.
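
As a rough illustration of the npm idea in item 1, the sketch below reads the "name" field from a package.json and builds the registry URL to probe. It assumes the public registry at https://registry.npmjs.org serves package metadata at `/<name>` (with the slash in scoped names percent-encoded) and that any status other than 200 means the package is absent. The function names are hypothetical and are not part of Scorecard.

```python
import json
import urllib.parse
import urllib.request

NPM_REGISTRY = "https://registry.npmjs.org"  # assumption: the public npm registry


def npm_package_name(package_json_text):
    """Return the "name" field of a package.json document, or None."""
    try:
        data = json.loads(package_json_text)
    except json.JSONDecodeError:
        return None
    name = data.get("name") if isinstance(data, dict) else None
    return name if isinstance(name, str) and name else None


def npm_metadata_url(name):
    """Build the metadata URL; the '/' in scoped names becomes %2F."""
    return f"{NPM_REGISTRY}/{urllib.parse.quote(name, safe='@')}"


def npm_package_exists(name, timeout=10):
    """Probe the registry; anything other than HTTP 200 counts as absent."""
    try:
        with urllib.request.urlopen(npm_metadata_url(name), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # HTTPError, URLError and timeouts all derive from OSError
        return False
```

A name match alone does not prove that the registry package was built from this repository, so a check along these lines would only be a heuristic.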

@laurentsimon laurentsimon added the kind/enhancement New feature or request label Jul 13, 2021
@laurentsimon laurentsimon added this to the milestone-q3 milestone Aug 2, 2021
@laurentsimon
Contributor Author

Regarding improvement 1 (the Packaging check only looks for GitHub packaging workflows, which is not the only way to publish code, so we should check for the presence of the package in language registries):

For Go, we would like to query https://pkg.go.dev/ and see whether a corresponding package exists.

@jba

jba commented Oct 24, 2021

pkg.go.dev developer here.

pkg.go.dev learns everything it knows from the Go module proxy, https://proxy.golang.org. Visit that page for a description of the protocol.

So the proxy is the source of truth, and it can also handle much higher QPS than us. However, it's less discriminating: it doesn't examine what it's given to make sure it's really a Go module. For instance, it doesn't check for the presence of .go files. (No .go file, no module.) We do.

So if that is important to you, then checking pkg.go.dev is reasonable, provided it's at relatively low QPS. If you know the version of the module you're looking for, supplying it will reduce load on us. A sufficient check for existence is to check the status of a HEAD request, and treat anything other than 200 as false.
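
The existence check described above could be sketched as follows. This is a minimal illustration, not Scorecard code: the helper names are hypothetical, and the `https://pkg.go.dev/<module>@<version>` form is the assumed way of "supplying the version". Redirect handling is left at urllib's defaults.

```python
import urllib.request


def pkg_go_dev_url(module_path, version=None):
    """Build the pkg.go.dev page URL, pinning the version when it is known."""
    url = f"https://pkg.go.dev/{module_path}"
    return f"{url}@{version}" if version else url


def exists_by_head(url, timeout=10):
    """Send a HEAD request and treat anything other than HTTP 200 as False."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # HTTPError, URLError and timeouts all derive from OSError
        return False
```

For example, `exists_by_head(pkg_go_dev_url("golang.org/x/text", "v0.3.7"))` issues a single HEAD request for one pinned version, which keeps the load on pkg.go.dev low.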

@naveensrinivasan
Member

Thanks. Is there an API for pkg.go.dev? It would help a lot compared with parsing HTML.

@jba

jba commented Oct 24, 2021

There is no API, but if you're just checking for existence you don't need to parse HTML. Is there some other information you need?

@laurentsimon
Contributor Author

Thanks, @jba. So essentially we just need to check the HTTP status. That should be good enough for our current use case, I think. @naveensrinivasan, is there anything else you think we need?

@naveensrinivasan
Member

I can't think of anything else for now. Thanks.

@laurentsimon
Contributor Author

@di, what can be done on the PyPI side, similar to #688 (comment)?

@di
Member

di commented Nov 1, 2021

IIUC you're looking for a way to determine if a given project name is published on PyPI?

That would require checking if the name exists (via HTTP status) at either:

  • Simple API: https://pypi.org/simple/<project_name>
  • JSON API: https://pypi.org/pypi/<project_name>/json

More details on available APIs here: https://warehouse.pypa.io/api-reference/
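
A minimal sketch of that status check, using the two URL shapes listed above. The name normalization (lowercase, runs of `-`, `_` and `.` collapsed to `-`) follows PEP 503 and is an addition here, as is the trailing slash on the Simple API URL; the function names are hypothetical.

```python
import re
import urllib.request


def normalize_project_name(name):
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def pypi_simple_url(project_name):
    """Simple API URL for a project."""
    return f"https://pypi.org/simple/{normalize_project_name(project_name)}/"


def pypi_json_url(project_name):
    """JSON API URL for a project."""
    return f"https://pypi.org/pypi/{normalize_project_name(project_name)}/json"


def is_published_on_pypi(project_name, timeout=10):
    """Probe the JSON API; anything other than HTTP 200 counts as absent."""
    try:
        with urllib.request.urlopen(pypi_json_url(project_name), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # HTTPError, URLError and timeouts all derive from OSError
        return False
```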

@laurentsimon
Contributor Author

That's exactly what we need. Thank you!

@laurentsimon
Contributor Author

@di is there an API that takes as input a GitHub repository instead of a package name? Or would we need to query the "project links" to infer the package name to input to the API? If so, what's the best way to do it?

@di
Member

di commented Feb 2, 2022

@laurentsimon You mean that you have a GitHub repo and you want to determine what PyPI project it corresponds to?

There are a few ways:

  • Build the project hosted in the repo (complicated, might not work), see what package name it produces, and assume this corresponds to a project on PyPI (it might have the same name but not be the same project, so it's not guaranteed)
  • Build a mapping of project links to PyPI packages (no API exists for this) and make the assumption that the metadata is correct (anyone can put any project link pointing to any GitHub repo they want, so it's not guaranteed)
  • Wait for PyPI's OIDC integration & support for publishing from GitHub Actions to land, which requires a strong link between a GitHub repo and a PyPI project (guaranteed, and we can make this available via an API)
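
To make the second option concrete, here is a hedged sketch that checks whether a candidate project's declared links point at a given GitHub repository. It assumes the caller has already fetched and parsed the project's JSON API response, and that the response carries the links under `info.project_urls` and `info.home_page`. The helper names are hypothetical, and, as noted above, a match is not a guarantee, because anyone can declare any link.

```python
from urllib.parse import urlparse


def _repo_slug(url):
    """Return 'owner/repo' (lowercased) for a github.com URL, else None."""
    parsed = urlparse(url if "://" in url else f"https://{url}")
    if parsed.netloc.lower() not in ("github.com", "www.github.com"):
        return None
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) < 2:
        return None
    repo = parts[1][:-4] if parts[1].endswith(".git") else parts[1]
    return f"{parts[0]}/{repo}".lower()


def links_to_repo(pypi_json, repo_url):
    """True if any declared project link points at repo_url's repository."""
    target = _repo_slug(repo_url)
    if target is None:
        return False
    info = pypi_json.get("info") or {}
    candidates = list((info.get("project_urls") or {}).values())
    candidates.append(info.get("home_page") or "")
    return any(_repo_slug(link) == target for link in candidates if link)
```

Building the reverse mapping (from repository to package names) would still require a separate index, since no API exists for it.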

@laurentsimon
Contributor Author

Thank you, @di. Let's wait until OIDC provides the magic.

@laurentsimon
Contributor Author

laurentsimon commented Aug 23, 2022

We need to decide the purpose of this check: is it to establish that a corresponding package exists in the ecosystem's registry, or that the publishing steps run in CI rather than on a local dev machine? If the latter, we only need to look for additional publishing commands and need not worry about querying package registries.
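
If the check goes the second way, looking for publishing commands could be sketched as below. This assumes the workflow file has already been parsed into nested dicts and lists by a YAML library, and it tokenizes each `run` line with `shlex` instead of matching regexes. The command list is illustrative and non-exhaustive, not the set Scorecard uses, and the function names are hypothetical.

```python
import shlex

# Illustrative, non-exhaustive list of publishing commands (an assumption).
PUBLISH_COMMANDS = [
    ("npm", "publish"),
    ("twine", "upload"),
    ("cargo", "publish"),
    ("gem", "push"),
]


def _line_publishes(line):
    """True if a shell line contains one of the known publishing commands."""
    try:
        tokens = shlex.split(line, comments=True)
    except ValueError:  # e.g. a quote left open on this line
        return False
    return any(
        tuple(tokens[i:i + 2]) == command
        for i in range(len(tokens) - 1)
        for command in PUBLISH_COMMANDS
    )


def workflow_publishes(workflow):
    """Scan every job step's `run` script for a publishing command."""
    for job in (workflow.get("jobs") or {}).values():
        for step in (job or {}).get("steps") or []:
            run = step.get("run") if isinstance(step, dict) else None
            if isinstance(run, str) and any(_line_publishes(l) for l in run.splitlines()):
                return True
    return False
```

This stays a heuristic: it matches the two tokens anywhere on a line, so `echo npm publish` would also count, and steps that publish through a `uses:` action are not covered.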

@laurentsimon
Contributor Author

Also note that for Go projects, this check will fail, and that is a false negative. Go projects don't need to be "released", since they are fetched directly from the repository (and then cached) when consumers run go install or go get.
