Change flock command argument #358
Conversation
When looking at the manual of the

What version of flock is installed in the ssh server container? Googling

@danielolsen I'm using an image based on Alpine, which Docker currently recommends:
(force-pushed from a493053 to 76736b4)
@dmuldrew is the goal of the ssh server container to be a reproduction of our existing setup, something which we hope to distribute to external users, or both? This issue seems to be fairly easy to resolve, but there may be many other smaller issues (e.g. mawk/gawk) that arise because of choices we've made about which distros we're using in which containers.
@danielolsen I think the goal is to reduce testing with our production server and model more of our existing setup so that we can automate more integration tests with GitHub actions. I don't think we want to distribute to external users a platform which uses ssh? It's nontrivial to set up and is specific to our infrastructure. It's possible to make a Docker build script that will make a container almost identical to our compute server, however that would be a lot more work to get everything to build correctly and more complex to maintain.
Yep, the goal of the ssh container is purely for testing. For external use we'll provide a different container which uses the shared volume (so no ssh). |
If we're trying to mock our production server setup for internal testing, do we know that all commands that will work on an Alpine build will also work on Ubuntu? Otherwise we may think something will work but it could fail unexpectedly as soon as we deploy. What blocks us from using Ubuntu directly? |
I agree with @danielolsen
(force-pushed from 9b1a1ed to e3e97cc)
```diff
@@ -55,6 +55,13 @@ def _execute_and_check_err(self, command, err_message):
         :return: (*str*) -- standard output stream.
         """
         stdin, stdout, stderr = self.data_access.execute_command(command)
+        if len(stderr.readlines()) != 0:
+            command_output = stdout.readlines()
```
I know it's not part of your changes, but just noticed the return type in the docstring should actually be `list`
Yeah, I'm not a fan of passing stream references around. If you read the buffer and don't save the contents somewhere, you can potentially have later code that thinks there was no error.
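The pitfall described here is easy to demonstrate with an in-memory stream (`io.StringIO` standing in for the channel's stderr file object; the error text is made up):

```python
import io

# A stream is a one-shot buffer: once readlines() consumes it, later
# readers see an empty list and may wrongly conclude there was no error.
stream = io.StringIO("flock: some error\n")
first = stream.readlines()   # drains the buffer
second = stream.readlines()  # already exhausted

print(first)   # ['flock: some error\n']
print(second)  # [] -- looks like no error occurred
```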
Mentioning a couple points about the docker image for continuity. The ideal situation would be using the exact same image in dev/test that we use in practice. So using Ubuntu would be closer to reality but still different from the server (which is not a docker image) - it wouldn't necessarily have important aspects like the NFS mount, currently installed software which is managed manually, users, etc. This could actually lead to false positives, if we put too much trust in tests that aren't truly an exact replica of the production server.

I think the salient point here is we are sacrificing something, and as long as we know exactly what that is we can use the tools effectively. In this case, the docker test infra will essentially just test code that interacts with the filesystem (so any POSIX-compliant image should work). Anything else that depends on os functionality is explicitly not tested by this setup, so we need to treat changes of that sort differently. I think that's ok based on the relative frequency (there are way more changes related to pure python than there are to the interaction with external dependencies).

Last point is that I'd like to not make assumptions about what the future architecture looks like. We do have some abstract long term goals but those are (hopefully) independent of this stuff, which is basically implementation details. What I mean is, we may or may not have a server at some point, or the server could be used solely to run containers, or we may have a variable number of servers but they are in the cloud, or we might use some "containers as a service" thing, etc. What we're doing here is addressing a known short term need with a reasonable amount of effort, but avoiding trying to address a possible longer term need which would be much harder, and with potentially less (or zero) payoff. Sorry this is kind of meta but just want to make sure we collectively avoid falling into some kind of design trap (not sure exactly which one... something about planning involving unknowns).
Pull Request Etiquette doc
Purpose

Use the more standard flag for `flock`, to correspond with the `flock` installed on the ssh server container. Improve error handling of executed commands to provide more information from `stderr`.

What the code is doing
Saves the `stdout` and `stderr` buffers to variables, then prints `stderr` in the `IOError` message, which for a `flock` error looks like the following:

The provided error `Failed to generate id for new scenario` wasn't enough information to diagnose the underlying issue.

Testing
Automated testing of the dockerized framework. Checked to make sure `flock` on the compute server also has an `-x` flag. I'm actually not sure why `flock -e` works on the compute server given the following:

Time estimate

~15 min
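For context on the flag change itself: util-linux `flock` documents `-e` and `-x` as synonyms for an exclusive lock, while BusyBox's `flock` (the one an Alpine image ships) appears to support only `-x`, which would explain why `-e` works on the compute server but not in the container. The exclusive-lock semantics behind `flock -x` can be sketched from Python via the `fcntl` module (an illustration only, not the project's code; the lock file is a throwaway temp file):

```python
import fcntl
import tempfile

# Take an exclusive advisory lock -- the semantics behind `flock -x`
# (and util-linux's synonym `-e`) -- do some work, then release it.
lock_path = tempfile.NamedTemporaryFile(delete=False).name
with open(lock_path, "w") as f:
    fcntl.flock(f, fcntl.LOCK_EX)   # block until the exclusive lock is held
    f.write("critical section\n")   # work performed under the lock
    fcntl.flock(f, fcntl.LOCK_UN)   # release explicitly (close would also do)
```

Any second process attempting `fcntl.flock(..., fcntl.LOCK_EX)` on the same file would block until the release, which is exactly the mutual exclusion the scenario-id generation relies on.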