Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create Dangerzone environments and run QA tests on them #289

Merged
merged 12 commits into from
Jan 16, 2023
Merged

Conversation

apyrgio
Copy link
Contributor

@apyrgio apyrgio commented Dec 12, 2022

This PR introduces two scripts that can help with Dangerzone development and releases. Those are:

  • dev_scripts/env.py: This script creates Dangerzone environments on various Linux platforms. These environments can be used to test Dangerzone and build artifacts.
  • dev_scripts/qa.py: This script runs QA tests on the environments that the previous script creates. These QA tests are taken from our release instructions, but they are not fully automated. They act as guard rails for the developer who wants to ensure that Dangerzone works for a platform.

Closes #286
Closes #287

@deeplow
Copy link
Contributor

deeplow commented Dec 15, 2022

I had some specific comments about the windows adaptation of the install/windows/build-image.py but I think I have a better suggestion that works on all OSes - install/build-image.py:

from dangerzone.container import get_runtime_name
import subprocess

runtime = get_runtime_name()

def run(cmd):
    subprocess.run(cmd, shell=True)

print("Building container image")
run(f'{runtime} build container --platform linux/amd64 --tag dangerzone.rocks/dangerzone')

print("Saving and compressing container image")
run(f'{runtime} save dangerzone.rocks/dangerzone | gzip > share/container.tar.gz')

print("Looking up the image id")
run(f'{runtime} image ls dangerzone.rocks/dangerzone | grep "dangerzone.rocks/dangerzone" | tr -s \' \' | cut -d\' \' -f3 > share/image-id.txt')

This way we would de-duplicate some of the code and remove the added gzip complexity in python code. I tested on Linux and this python script is 3 times faster than the proposed one.

What do you think about using this one? I can also add this as a separate PR as it's not core this this change.

Copy link
Contributor

@deeplow deeplow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested this solely on a Qubes system. I haven't tested yet on windows.

Feedback:

  • duplication: see this discussion.
  • **there could be a way to move back to a previous step. From playing around it appears to be easy to mistakenly move on to the next step.
  • have you considered mounting in the containers ~/.bashrc? It oculd be useful to have the history of commands so we don't have to repeatedly type them.
  • The build .deb step was much slower for me. It took 27mins. Is this slow performance to be expected?
  • In the container there is no PDF viewer, which leads to
    no-pdf-viewer
  • when an 'auto' is running and you press any key when the command is running, those keys will get pressed after the command runs. So if you press enter while waiting to see if something happens, then it presses enter on the next commands -- but because enter skips the task, you advance to the next ones. I don't know how to solve this problem easily, but changing the auto from enter to something else less clickable can help.

Some situations where 'auto' failed:

  • I've had many 'auto' commands related to lack of space. But that's to be expected. After increasing the size the went away. Because this may be so common, should we add a check for them?

  • Failure in auto step "Create Dangerzone QA environment" (fedora 36)

I was unable to find why this was failed, but after repeating the step, it worked.

[...]
  xcb-util-renderutil-0.3.9-20.fc36.x86_64                                      
  xcb-util-wm-0.4.1-22.fc36.x86_64                                              
  xkeyboard-config-2.35.1-1.fc36.noarch                                         
  xml-common-0.6.3-58.fc36.noarch                                               
  xz-5.2.5-9.fc36.x86_64                                                        
  yajl-2.1.0-18.fc36.x86_64                                                     

Complete!
--> de82eca9bdf
STEP 5/10: RUN rm /tmp/dangerzone-0.4.0-1.noarch.rpm
error running container: did not get container start message from parent: EOF
Error: building at STEP "RUN rm /tmp/dangerzone-0.4.0-1.noarch.rpm": slirp4netns failed
Traceback (most recent call last):
  File "/home/user/dangerzone/dev_scripts/env.py", line 509, in <module>
    sys.exit(main())
  File "/home/user/dangerzone/dev_scripts/env.py", line 505, in main
    return args.func(args)
  File "/home/user/dangerzone/dev_scripts/env.py", line 409, in env_build
    env.build(show_dockerfile=args.show_dockerfile)
  File "/home/user/dangerzone/dev_scripts/env.py", line 391, in build
    self.runtime_run("build", "-t", image, build_dir)
  File "/home/user/dangerzone/dev_scripts/env.py", line 235, in runtime_run
    subprocess.run(self.runtime_cmd + list(args), check=True)
  File "/usr/lib64/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['podman', 'build', '-t', 'dangerzone.rocks/fedora:36', PosixPath('/home/user/dangerzone/dev_scripts/envs/fedora/36/build')]' returned non-zero exit status 125.
Traceback (most recent call last):
  File "/home/user/dangerzone/./dev_scripts/qa.py", line 470, in <module>
    sys.exit(main())
  File "/home/user/dangerzone/./dev_scripts/qa.py", line 464, in main
    platform.start()
  File "/home/user/dangerzone/./dev_scripts/qa.py", line 294, in start
    self.build_qa_image()
  File "/home/user/dangerzone/./dev_scripts/qa.py", line 91, in inner
    out = func(self, *args, **kwargs)
  File "/home/user/dangerzone/./dev_scripts/qa.py", line 280, in build_qa_image
    self.run(
  File "/home/user/dangerzone/./dev_scripts/qa.py", line 116, in run
    subprocess.run(args, check=True)
  File "/usr/lib64/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '('/home/user/dangerzone/dev_scripts/env.py', '--distro', 'fedora', '--version', '36', 'build')' returned non-zero exit status 1.

Other suggestions:

  • copy test_docs to the container's home folder
    When testing I found myself wasting quite a bit of time with selecting the documents. If every time the container I had already the test documents in the home directory, I could select them more easily. Additionally, it would be nice if we had a fresh copy of the test_docs directory, since every time convert them, they are moved to the unsafe directory.

  • Report generation
    It would be nice at the end if the script generated some markdown that could be pasted on the QA issue for the release about what was tested. See my comment about prompt() on collecting notes about scenarios and then reporting them at the end.

Next steps (possibly future PRs):

  • Replace in CI build-* with building with your setup and deprecate packagecloud

.circleci/config.yml Outdated Show resolved Hide resolved
.circleci/config.yml Outdated Show resolved Hide resolved
install/windows/build-image.py Show resolved Hide resolved
install/windows/build-image.py Show resolved Hide resolved
dev_scripts/README.md Outdated Show resolved Hide resolved
dev_scripts/README.md Outdated Show resolved Hide resolved
dev_scripts/env.py Show resolved Hide resolved
dev_scripts/qa.py Outdated Show resolved Hide resolved
dev_scripts/env.py Show resolved Hide resolved
dev_scripts/qa.py Show resolved Hide resolved
@apyrgio
Copy link
Contributor Author

apyrgio commented Jan 2, 2023

I had some specific comments about the windows adaptation of the install/windows/build-image.py but I think I have a better suggestion that works on all OSes - install/build-image.py:

8< ... snip ... >8

Just to make sure I understand, do you propose this as a drop-in script for all OSes? Because the gzip executable will not be available on Windows

This way we would de-duplicate some of the code and remove the added gzip complexity in python code. I tested on Linux and this python script is 3 times faster than the proposed one.

That's weird. The script you suggest and the existing Linux one are very similar, so where do you attribute this boost?

What do you think about using this one? I can also add this as a separate PR as it's not core this this change.

In general, having a script for all OSes would be cool, so I'd be down with that, as a separate PR of course. We could even branch this discussion to a new issue.

@deeplow
Copy link
Contributor

deeplow commented Jan 2, 2023

I had some specific comments about the windows adaptation of the install/windows/build-image.py but I think I have a better suggestion that works on all OSes - install/build-image.py:
8< ... snip ... >8

Just to make sure I understand, do you propose this as a drop-in script for all OSes? Because the gzip executable will not be available on Windows

Ah. Damn. I forgot about that. That's probably the reason why the windows one was in python. I don't know where I concluded that it worked on all systems, then.

Let's do this in another PR as you suggest.

This way we would de-duplicate some of the code and remove the added gzip complexity in python code. I tested on Linux and this python script is 3 times faster than the proposed one.

That's weird. The script you suggest and the existing Linux one are very similar, so where do you attribute this boost?

Chunk sizes, perhaps?

@apyrgio
Copy link
Contributor Author

apyrgio commented Jan 4, 2023

Replying to the rest of the feedback (#289 (review)):

I tested this solely on a Qubes system. I haven't tested yet on windows.

Feedback:

* [ ]  **duplication:** see [this discussion](https://github.com/freedomofpress/dangerzone/pull/289#discussion_r1054373702).

Yeap, juicy subject, let's continue the discussion there.

* [ ]  **there could be a way to move back to a previous step. From playing around it appears to be easy to mistakenly move on to the next step.

We can do it, but is it worth doing it? I think it's one of the things we should do once we use this script more.

* [ ]  have you considered mounting in the containers `~/.bashrc`? It oculd be useful to have the history of commands so we don't have to repeatedly type them.

Good question. You probably refer to the .bash_history file, and I've in fact considered it. Currently, the state of the Dangerzone environment is just the Podman containers (/home/user/.local/share/containers), but that wasn't always the case. I used to mount the whole user's home (/home/user) as a directory, meaning that all user state would persist across executions (containers, bash history, Dangerzone config, you name it).

I dropped it because I need to think a bit if past state can affect subsequent executions. Most probably it won't, but it doesn't hurt us delaying this a bit. Are you ok with that?

* [ ]  The build `.deb` step was much slower for me. It took 27mins. Is this slow performance to be expected?

No, it shouldn't. I haven't experience such slow build times myself. We may need to debug this.

* [ ]  In the container there is no PDF viewer, which leads to
  ![no-pdf-viewer](https://user-images.githubusercontent.com/47065258/209135525-c9663944-58e1-48aa-bef8-b1c554b2b824.png)

I personally install mupdf, which is a light-weight PDF viewer, when running the QA steps. Do you have a PDF viewer that does not bring much dependencies?

* when an 'auto' is running and you press any key when the command is running, those keys will get pressed after the command runs. So if you press enter while waiting to see if something happens, then it presses enter on the next commands -- but because enter skips the task, you advance to the next ones. I don't know how to solve this problem easily, but changing the auto from `enter` to something else less clickable can help.

Good catch. I've mentioned in the corresponding discussion (#289 (comment)) that we can consume the process' stdin before prompting the user, which would solve these cases. We can discuss more there.

Some situations where 'auto' failed:

* I've had many 'auto' commands related to lack of space. But that's to be expected. After increasing the size the went away. Because this may be so common, should we add a check for them?

Hm, this gun shoots in two directions. Imagine the case where a step can run without extra space, but the system is low in resources. In that case, what will this check do? Cancel a reasonable execution? Throw a warning that will be lost in the log messages. I'm not sure how we can make this work, but if you have a suggestion, shoot it. Disclaimer, most of the issues I've encountered where related to low disk space as well, so I get your point.

* Failure in auto step "Create Dangerzone QA environment" (fedora 36)

I was unable to find why this was failed, but after repeating the step, it worked.

Interesting, I haven't seen this before. Is this reproducible? If yes, we can look into it. Else, I'd attribute it to something transient (maybe low disk space).

@deeplow
Copy link
Contributor

deeplow commented Jan 4, 2023

* [ ]  have you considered mounting in the containers `~/.bashrc`? It oculd be useful to have the history of commands so we don't have to repeatedly type them.

Good question. You probably refer to the .bash_history file, and I've in fact considered it. Currently, the state of the Dangerzone environment is just the Podman containers (/home/user/.local/share/containers), but that wasn't always the case. I used to mount the whole user's home (/home/user) as a directory, meaning that all user state would persist across executions (containers, bash history, Dangerzone config, you name it).

I dropped it because I need to think a bit if past state can affect subsequent executions. Most probably it won't, but it doesn't hurt us delaying this a bit. Are you ok with that?

Yes, I meant .bash_history. I get the point of keeping the state. But I wouldn't consider keeping past commands in history as state that will be of terrible consequence. If anything it will speed up testing. It's a quality-of-life improvement that I think would make a difference in practice. So I'd insist for us to keep it.

I personally install mupdf, which is a light-weight PDF viewer, when running the QA steps. Do you have a PDF viewer that does not bring much dependencies?

Nope. It can be that one or anything else you prefer.

Hm, this gun shoots in two directions. Imagine the case where a step can run without extra space, but the system is low in resources. In that case, what will this check do? Cancel a reasonable execution? Throw a warning that will be lost in the log messages. I'm not sure how we can make this work, but if you have a suggestion, shoot it. Disclaimer, most of the issues I've encountered where related to low disk space as well, so I get your point.

Just a warning would be my suggestion + the use of colors (bold, yellow). But we can tackle this in the future when we consider colors in the script.

Interesting, I haven't seen this before. Is this reproducible? If yes, we can look into it. Else, I'd attribute it to something transient (maybe low disk space).

Not reproducible. Let's just discard this issue.

@apyrgio
Copy link
Contributor Author

apyrgio commented Jan 5, 2023

* [ ]  have you considered mounting in the containers `~/.bashrc`? It oculd be useful to have the history of commands so we don't have to repeatedly type them.

Good question. You probably refer to the .bash_history file, and I've in fact considered it. Currently, the state of the Dangerzone environment is just the Podman containers (/home/user/.local/share/containers), but that wasn't always the case. I used to mount the whole user's home (/home/user) as a directory, meaning that all user state would persist across executions (containers, bash history, Dangerzone config, you name it).
I dropped it because I need to think a bit if past state can affect subsequent executions. Most probably it won't, but it doesn't hurt us delaying this a bit. Are you ok with that?

Yes, I meant .bash_history. I get the point of keeping the state. But I wouldn't consider keeping past commands in history as state that will be of terrible consequence. If anything it will speed up testing. It's a quality-of-life improvement that I think would make a difference in practice. So I'd insist for us to keep it.

Ok, I'll add this.

I personally install mupdf, which is a light-weight PDF viewer, when running the QA steps. Do you have a PDF viewer that does not bring much dependencies?

Nope. It can be that one or anything else you prefer.

Cool, I'll install it as well, both in the dev and the install environment.

Hm, this gun shoots in two directions. Imagine the case where a step can run without extra space, but the system is low in resources. In that case, what will this check do? Cancel a reasonable execution? Throw a warning that will be lost in the log messages. I'm not sure how we can make this work, but if you have a suggestion, shoot it. Disclaimer, most of the issues I've encountered where related to low disk space as well, so I get your point.

Just a warning would be my suggestion + the use of colors (bold, yellow). But we can tackle this in the future when we consider colors in the script.

Got it.

Interesting, I haven't seen this before. Is this reproducible? If yes, we can look into it. Else, I'd attribute it to something transient (maybe low disk space).

Not reproducible. Let's just discard this issue.

Ok.

@apyrgio apyrgio force-pushed the 286-linux-envs branch 5 times, most recently from 9fd0fd4 to 008ae27 Compare January 10, 2023 22:58
@apyrgio
Copy link
Contributor Author

apyrgio commented Jan 10, 2023

Hi @deeplow. I think I've addressed all the review comments that we talked about, and I fixed some issues tangent to this code. Feel free to give me your opinion on the changes.

Copy link
Contributor

@deeplow deeplow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only minor comments / questions. No need to address them as they are minor.

Approved in general

Comment on lines +444 to +449
# If the reference is actually defined in subclasses, grab it from
# `self`.
if isinstance(ref, str):
_ref = getattr(self, ref)
else:
_ref = ref
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't understand this part. Could you explain a bit more the logic? Or give an example where this happens?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sigh, I'm not proud of this code. An example where the problem manifests is here:

@QABase.task("Build Dangerzone image", ref="REF_BUILD", auto=True)
def build_container_image(self):

If you look carefully, I use ref="REF_BUILD", instead of ref=REF_BUILD. The reason is that during class creation, REF_BUILD = None. To accomodate for the REF_BUILD that the subclasses later define, I had to resort to a dynamic approach 😞 .

dev_scripts/env.py Show resolved Hide resolved
@apyrgio apyrgio force-pushed the 286-linux-envs branch 2 times, most recently from f0c3c82 to 57bd20e Compare January 16, 2023 12:01
Skip the creation of the `share/container.tar` file, since it's not used
anywhere. Instead, pipe our `docker/podman save` invocations to `gzip`
directly, which will compress the tarfile on the fly. This saves both
time and disk space.
Introduce `dev_scripts/env.py`, which is a script for building
Dangerzone environments for various Linux distros, and running commands
in them.

Closes #286
Add a design document for `dev_scripts/env.py`, which is a script that
creates Dangerzone environments for various Linux distros. In this
design document, we explain various architectural decisions that we have
taken for this script, as well as how it works under the hood, what are
its shortcomings, etc.

Refs #286
Add a script that makes the user go through the QA steps for a supported
Dangerzone platform, and may optionally run them automatically, if the
user agrees.

Closes #287
Add a short explanation of what is the purpose of the QA script, and
what it uses underneath.

Refs #287
Ignore the `dev_scripts/envs` folder when running tests or linting code,
as it may contain files that are not owned by the current user. In this
case, we've seen that pytest/black etc. fail.

This typically happens when the user has run Dangerzone in a
containerized environment (see #286), and Podman created a directory
with files owned by the user in the nested container.
Narrow down the system packages that we install in dev environments. The
rationale is that we get most of the Python dependencies from Poetry, so
we don't need to install them from the system as well.

The packages that we do need to install are non-Python ones, and this
commit adds some that were missing: make, python3-stdeb. Also, we
explicitly install the base Qt5 libraries, in order to get the graphics
and C++ libraries that we can't get from PyPI.
Use the `dev_scripts/env.py` script to run CI tests for some platforms
we couldn't run before.
Fedora 35 has reached its end of life [1], so we remove it from our CI
builds.

Closes #308

[1]: https://endoflife.date/fedora
@apyrgio
Copy link
Contributor Author

apyrgio commented Jan 16, 2023

I've rebased the branch, and I've removed the Fedora 35 support, so now our build completes successfully.

@apyrgio apyrgio merged commit 7d0b6d4 into main Jan 16, 2023
@deeplow deeplow deleted the 286-linux-envs branch May 11, 2023 13:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add CI tests for more Linux environments Automate the creation of Linux environments for Dangerzone testing
2 participants