Housekeeping #17
Conversation
Additional note: The proposed CI file builds and publishes the docker-image automatically to the GitHub container registry (GHCR). This is done mainly for illustrative purposes and can fairly easily be modified to push to Docker Hub instead. I chose GHCR as it does not require special authentication; or rather, the authentication is handled by GitHub.
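For reference, the authentication in such a workflow boils down to the GITHUB_TOKEN that GitHub provides; a minimal sketch, assuming the token is exposed to the step as GITHUB_TOKEN (the image name "OWNER/httpbin" and the tag are placeholders, not what the workflow actually pushes):

# Log in to ghcr.io with the workflow's own token
echo "$GITHUB_TOKEN" | docker login ghcr.io -u "$GITHUB_ACTOR" --password-stdin
# Build and push; the repository path must be lowercase for GHCR
docker build -t ghcr.io/OWNER/httpbin:latest .
docker push ghcr.io/OWNER/httpbin:latest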
Wow! Thank you for the work on this. Packaging has really moved on lately.
They seem functionally identical. Let's do MIT since it's more common.
Works for me
I've always done maintainer-controlled requirements but I'm good with CI if that simplifies releases.
Should this be in the dockerfile somehow? Is gunicorn included in the Docker image? Speaking of simplifying releases, I think this PR could use a little documentation in the README:
All in all, I agree with the decisions you have made in the PR.
As discussed in psf#17
Something that would make packaging easier is if we could limit it to two officially supported use-cases:
- using httpbin as a library (installed from PyPI)
- running httpbin via the official docker image
This would completely remove the need for a "lock" file. Projects using it as a library do not need one anyway, and an official docker image has all dependencies locked into the image itself. Maintaining standalone apps in Python is a bit tricky as you never know the state of the end-user's system. One of my colleagues at work maintains one such project and it constantly gives us headaches. We are thinking of using Flatpak or AppImage to package and distribute that application in the future to get more reproducible installations. @kevin1024 What are your thoughts on this? Would you be okay with limiting the official distribution to the two use-cases mentioned above?
References psf#17. The docker-image has all dependencies "frozen" in place, so a requirements.txt is not really needed. It would be difficult (but possible) to ensure that the docker-image uses the same `requirements.txt` as the one that was published; it adds considerable overhead to the build complexity for little gain. Supporting multiple end-user environments (and operating-systems) would also require *multiple* such files, and even then it is not guaranteed to work everywhere. Limiting the officially supported targets to "library" and "docker" removes those complexities.
I've made a bunch of changes and addressed the above comments. I added a new section "Maintenance" to the README which should help. With this PR, releases would be automatically triggered when the following conditions are met:
Concerning the docker-image location: the workflow is currently configured to use the GitHub registry. But using Docker Hub would be beneficial; it makes the "docker pull" command a tiny bit easier for end-users.
Actually.... scratch that... I have updated the README with the appropriate docker image-names (using the GitHub registry). It's not that much more difficult. I also provided a "latest" tag to allow pulling an image without specifying a tag. I do not know if GHCR has any limitations compared to Docker Hub, and after all this fiddling around I'm not really in the mood to dig into that topic.... so there's that :)
I am 100% OK with that. There may be other maintainers on this project that feel otherwise though. @nateprewitt, @sigmavirus24, and @timofurrer if you have opinions feel free to weigh in; otherwise I'm happy to keep this project moving. I just don't want to overstep.
So here's what I'm thinking about specifically with a pinned requirements file listing things like:
Flask
flasgger
importlib-metadata
six
That then allows that, if we release a Docker image today and tomorrow the base image we rely on has a CVE reported in it, we could trigger a new build with the same dependencies and an updated base layer. Users then don't have to keep a known CVE in some portion of their stack (test, or whatever), which matters because more and more people and companies are reporting on this kind of data. Users shouldn't have to wait for us to release a new version, in an unknown period of time, that has a new base image.
Alternatively, we don't need to auto-publish a docker image. Simply having the dockerfile could be sufficient for those users. But I want to be able to know that if we build the image today and then again a week from today, with the only intended change being the base image, we would have the same tested dependency configuration (unless we explicitly chose to upgrade it) between the two. Otherwise, some transitive dependency could update and break everything without us noticing between those two image publishes and we'd never know.
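In other words, the property being asked for is roughly this; a sketch assuming pip-tools is used for the pinning, with illustrative image tags:

# Pin the complete dependency tree once, with hashes, from the project metadata
pip-compile --generate-hashes --output-file requirements.txt pyproject.toml
# Today's image and next week's rebuild both install from that same file;
# --pull refreshes only the base layer, the Python dependencies stay identical
docker build --pull -t httpbin:build-1 .
docker build --pull -t httpbin:build-2 .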
@sigmavirus24 I see what you mean. In this case we could also directly use pip-compile on the pyproject.toml, as you mentioned earlier. I'm still trying to figure out how best to handle the "reproducible build" problem for the docker image that you mention, as the CI jobs run on images provided by GitHub (f.ex. the standard Ubuntu runner images). I will also see that I can commit that change during the CI run. This way, both the unit-tests and the docker image would use the same generated file. I'll work on that later today.
The last commit provides a generated requirements file. It is generated early in the workflow (using pip-compile) and is then used to bootstrap the environment for unit-testing. As the file is generated on the fly, the pins may differ from run to run; if the above issue arises, it only means that the unit-tests are running against a slightly different set of pinned dependencies. The "docker build" step then uses that same requirements file. Finally, after everything passed and the commit is tagged as a release, both the package and the docker image are published, and the generated requirements are attached as release files. These "release files" are only provided as a secondary way to build the image. I also added an appropriate section to the README file.
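Put as plain commands, the stages described above amount to something like the following; a simplified sketch of the workflow's intent, not the literal ci.yml (the "[test]" extra and image tag are placeholders):

# 1. Generate the pinned requirements early, from pyproject.toml
pip-compile --generate-hashes -o requirements.txt pyproject.toml
# 2. Bootstrap a clean environment from that file and run the unit-tests
python3 -m venv .venv
.venv/bin/pip install -r requirements.txt
.venv/bin/pip install -e . pytest
.venv/bin/python -m pytest
# 3. Build the docker image from the very same requirements.txt
docker build -t httpbin:candidate .
# 4. Only on a tagged release: publish the package and the image,
#    and attach the generated requirements.txt as a release file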
.github/workflows/ci.yml
sudo apt-get update
sudo apt-get install -y python3-pip
sudo pip install --upgrade pip
sudo pip install yq
Are these necessary? Also, why do we need sudo here?
yq contains tomlq and is a wrapper around jq, which makes it very easy to extract meta-data from a TOML file. I use this to fill in docker labels during build. This allows us to keep all that meta-data in one single place (the pyproject.toml).
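As an illustration of that mechanism, assuming the version and description are static fields in pyproject.toml (the label keys below are the standard OCI annotation names; the actual labels used in the workflow may differ):

# Pull fields out of pyproject.toml ...
VERSION=$(tomlq -r '.project.version' pyproject.toml)
SUMMARY=$(tomlq -r '.project.description' pyproject.toml)
# ... and bake them into the image as labels
docker build \
  --label "org.opencontainers.image.version=${VERSION}" \
  --label "org.opencontainers.image.description=${SUMMARY}" \
  -t httpbin:latest .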
Instead of sudo I can switch to a venv. I will still need sudo to install the python3-venv package then. Since Python 3.11, pip install raises an error, even when doing a "user-install", so a venv will be necessary. I will make that change and leave this thread unresolved in case you have additional feedback.
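Concretely, the adjusted setup step would look something along these lines (the venv path is illustrative):

sudo apt-get update
sudo apt-get install -y python3-venv
python3 -m venv /tmp/tooling
/tmp/tooling/bin/pip install --upgrade pip
/tmp/tooling/bin/pip install yq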
The change to using a venv has been applied.
Dockerfile
COPY --from=build /opt/httpbin /opt/httpbin
ADD httpbin.bash /opt/httpbin/bin
RUN chmod +x /opt/httpbin/bin/httpbin.bash
EXPOSE 80
We're breaking the API of where to get the image from, right? If we're doing that, why not also stop running as root and expose a different port? People can always remap them on the outside as they wish.
That would include adding a USER httpbin instruction and running the service in the bash script on a different port, but that should be fine.
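A minimal sketch of what that would add to the Dockerfile, assuming a Debian-based base image (user name and port are placeholders, and the bash script would then need to start the server on that same port):

RUN useradd --system --create-home httpbin
USER httpbin
EXPOSE 8080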
I was thinking the same thing but wanted to keep the external changes (of the running image) to a minimum.
If you are okay with that change I will gladly do it as it will simplify deployments in non-privileged docker runtimes (like OpenShift) which enforce non-root images.
I have pushed the necessary changes in 4923332. If we decide to keep the old external behaviour (port 80 by default) we can easily revert that commit.
With this change, the HTTPBIN_PORT customisation that I provided in #4 becomes moot and we could drop it again. It messes with the EXPOSE information of the image anyway and was only added as a workaround to make the image run on environments that enforce non-root containers. Should I remove that again? Only one other user expressed interest and it was only released for a short time. I'm generally against dropping something once it's released, but maybe we could make an exception here?
I saw the failures on Windows. I'm currently away and can look at them tomorrow at the earliest. But in general they are unrelated to the proposed changes; just some PowerShell shenanigans.
The cleanup of the dependencies I did recently removed a dependency which was needed after all. The image built fine without it, but the process did not start. That dependency (greenlet) now causes issues on Python 3.12. I will have to look into that in detail. I will also look into a way to test a proper startup of the container in the CI to catch these issues early.
The Python 3.12 issues come from greenlet. The fix is however only available in a pre-release/alpha of greenlet. To limit the risk of making a major version bump (and to an alpha version at that), I only apply that change for Python 3.12, which is itself in pre-release as well, via 715d6f7. Once greenlet has released a full/final 3.x release it would be good to simplify these dependencies.
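Mechanically this kind of conditional pin is just an environment marker on the dependency specifier, roughly along these lines (the version bounds shown here are placeholders; the real ones are in the commit referenced above):

greenlet ; python_version < "3.12"
greenlet >= 3.0.0a1 ; python_version >= "3.12"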
From what I can tell there are no big issues left in this PR. The biggest question would be the list of maintainers. I don't know who would qualify as maintainer; people with write access to the PyPI project maybe? The other open questions are "fine-polish" in my opinion. All should be fine even without addressing them; they can always be handled down the road even after the PR is done. Even the maintainer question can be handled later. The current release mentions Kenneth alone anyway, so it's not like it's removing/breaking anything existing. As a consequence I wouldn't mind if we could put a bow on this PR and finish it up.
Looks great! Thanks to @exhuma for all your work and to @sigmavirus24 and @nateprewitt for your review!
I too want to thank @sigmavirus24 for the in-depth and detailed review.
@kevin1024 in case you need further help, feel free to reach out to me. Either via my committer e-mail or just by assigning an issue to me.
This PR contains some opinionated housekeeping changes (#16). Feel free to add
counter-arguments to the proposed changes. As they are a matter of personal
preference, opinions may vary and I am open to discussion.
Please consider this PR a draft proposal.
Actionable points in this PR (discussed below):
- Is mainapp okay as the name for the extra-dependency?
- Who should generate the requirements.txt file? The CI or the maintainer? Note that the generated files may be different on each OS, so generating them in the CI might be beneficial.
Reasons why I chose to make these changes
Dropping Pipfile and setup.py in Favor of pyproject.toml
By moving everything into pyproject.toml, we remove duplication and thus also remove the risk of diverging dependencies. This was in fact already the case. Looking at Pipfile I found the following divergences from setup.py:

- gunicorn (and gevent) was missing in setup.py. This makes sense as it is only required when running httpbin as a main process. I added an extra-dependency called mainapp to cover this use-case (a sketch of the resulting section follows this list). This makes it installable as a runnable application using pip install httpbin[mainapp]. We might rename that extra to something else though.
- six was missing in setup.py. It has been added to pyproject.toml.
- pyyaml, rope and meinheld have no imports anywhere. They have been dropped.
- werkzeug had a minimum version in Pipfile, but not in setup.py. This has been merged into pyproject.toml.
- The license in setup.py mismatched with the LICENSE file (ISC vs MIT). This has now been consolidated. It might be necessary to review which of the two is actually the correct one.
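For illustration, the mainapp extra mentioned in the first item boils down to an optional-dependencies table roughly like this (version pins omitted; the real entries live in the PR's pyproject.toml):

[project.optional-dependencies]
mainapp = [
    "gunicorn",
    "gevent",
]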
This has the main disadvantage that the convenience commands from pipenv are no longer available. They can be useful for managing virtual-envs and adding new dependencies to the project. In my opinion (and experience), those use-cases don't surface often in a project, and the benefits of having everything in one place (i.e. pyproject.toml) outweigh those rare conveniences.

pip-tools instead of Pipfile
Dropping pipenv also removes the possibility to use the pipenv lock command. This can be easily reproduced using pip-compile from the pip-tools package. It has the advantage of producing a standard requirements.txt file (instead of the Pipfile.lock file).

Removal of the VERSION file

This file caused an import side-effect: it is accessed as soon as httpbin is imported, to provide the httpbin.version variable. In addition, the contents of that file do not necessarily reflect the installed package in case there are bugs in the project. It is better to reach into the project meta-data from the environment. Since Python 3.8 (a backport exists), importlib.metadata.version can handle this just fine, and it works well with pyproject-build. This PR makes use of this, also including the backport for older Python versions.
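The lookup itself is small; roughly like this (the try/except is the usual backport pattern, the exact code in the PR may differ):

try:
    from importlib.metadata import version  # Python 3.8+
except ImportError:
    from importlib_metadata import version  # backport for older versions

__version__ = version("httpbin")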
Docker Build Automation & requirements.txt

The timing of the generation of the "lock" file (independent of pipenv or pip-tools) is currently, with this PR, left to the maintainer by running pip-compile --upgrade --generate-hashes and committing the result. The requirements file could also be dropped from the repo and generated "on-the-fly" during the CI run. In that case it is important to ensure that the unit-tests are run with the same file. An example multi-stage CI flow (using "needs") has been added to the repo. This could be used to generate the needed files.

The requirements.txt file is only needed when running httpbin as the main application. When it is used as a library in another project, the responsibility of generating the requirements file falls to that project. So, as a result, it is not really necessary to package the requirements.txt file and it could very well be generated on-the-fly when generating the docker image.
Publishing via GitHub Actions
The new proposed workflow contains a job to publish to PyPI. All that is required is to create a workflow secret with the name PYPI_TOKEN.
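The publish step itself is conventional; roughly the following, assuming the secret is exposed to the job as PYPI_TOKEN and using build/twine for illustration:

pip install build twine
# Build sdist and wheel as defined by pyproject.toml
pyproject-build
# Upload using an API token stored in the PYPI_TOKEN secret
TWINE_USERNAME=__token__ TWINE_PASSWORD="$PYPI_TOKEN" twine upload dist/*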
Other Side-Effects and Notes
- The test_suite key from setup.py is lost. This is, in my opinion, negligible.
- The include_package_data key was dropped because it is true by default.
- this timeout, causing runs to fail seemingly at random.