Native support for zipping the swig generated python files #1144
Comments
I think it's a good idea. One thing to figure out: when a header file changes, and the resulting Python file changes with it, how do we compare that file with the copy in the zip? I'm sure the team will want to find the minimal work required to work with the zip file. As for anything you are adding to PYTHONPATH, I'm OK with IPPython.cpp doing that so it's automatic.
I'm working on an MR for this right now!
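One possible way to do that comparison without unpacking the archive is sketched below. It is only an illustration: the helper name `member_is_current` and the idea of keying on size plus CRC-32 are assumptions, not anything Trick does today. It relies on the CRC-32 and uncompressed size that every zip archive already stores per member.

```python
import zipfile
import zlib
from pathlib import Path


def member_is_current(zip_path, arcname, disk_file):
    """Return True if disk_file matches the size and CRC-32 stored for arcname.

    Hypothetical helper: only the zip's directory metadata is read,
    the member itself is never decompressed.
    """
    data = Path(disk_file).read_bytes()
    with zipfile.ZipFile(zip_path) as zf:
        try:
            info = zf.getinfo(arcname)
        except KeyError:
            return False  # the file has not been added to the archive yet
    return info.file_size == len(data) and info.CRC == zlib.crc32(data)


if __name__ == "__main__":
    import tempfile

    work = Path(tempfile.mkdtemp())
    src = work / "generated.py"          # stand-in for a swig-generated file
    src.write_text("VALUE = 1\n")
    with zipfile.ZipFile(work / "demo.zip", "w") as zf:
        zf.write(src, "generated.py")

    print(member_is_current(work / "demo.zip", "generated.py", src))
    src.write_text("VALUE = 2\n")        # simulate a regenerated file
    print(member_is_current(work / "demo.zip", "generated.py", src))
```

A check like this could let a build step re-zip only when at least one generated file actually differs from its archived copy.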
dbankieris added a commit that referenced this issue on May 19, 2021
dbankieris added a commit that referenced this issue on May 20, 2021: Hide the non-zipped Python modules to indicate to users that changing them will have no effect on the sim. Refs #1144
astrophysics referenced this issue in astrophysics/trick on Nov 24, 2022: Hide the non-zipped Python modules to indicate to users that changing them will have no effect on the sim. Refs #1144
Motivation
During simulation startup, the Python input processor `import`s a lot of files that are not in user space but are generated by SWIG under the `SIM_*/trick/` directory. For small sims, these are just a handful of files, but for large sims, like in the `ramtares/` workflow for example, these are thousands of files. Each time a single sim instance starts, the sim must interact with the file system to open all of these files before it even gets to the first executable line of `RUN_*/input.py`.

On local filesystems, this process happens quite quickly, but Monte Carlo workflows typically require runs to be executed in a workspace on a network file system, since runs are not all run on the same machine and copying sims around is cumbersome. This usually means dozens, hundreds, and sometimes thousands of runs all start up near the same wall-clock time, and when each sim must read thousands of files from the network file system, this means potentially a million file-open operations hitting the same network all at once.
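To make the scale concrete, here is a rough back-of-the-envelope sketch. The function names are hypothetical, and the run count of 400 is an illustrative assumption, not a measurement; the model simply assumes each run opens every generated file once.

```python
from pathlib import Path


def count_python_files(root):
    """Count the .py files under root, recursively."""
    return sum(1 for _ in Path(root).rglob("*.py"))


def estimated_opens(files_per_sim, concurrent_runs):
    """Rough estimate: every run opens every generated file once."""
    return files_per_sim * concurrent_runs


# Illustrative numbers only: about 3000 generated files, 400 runs starting together.
print(estimated_opens(3000, 400))  # 1200000
```

In practice `count_python_files` could be pointed at a sim's generated `trick/` directory to replace the 3000 with a real count.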
We have seen extreme network filesystem slowness in AGDL, and now FSL, consistently over the years. A recent evaluation of the NFS metrics combined with `strace` on the sim leads us to believe it is this massive number of file `open()`s on the network FS that is contributing to the network slowness.

Reducing the file I/O by zipping `SIM_*/trick/`
It has recently come to my attention that Python understands how to read modules from a `.zip` file natively, and I recently prototyped this approach to replace all of our 3000 file reads with a single read of `trick.zip`. Basically this is how it works:

Confirmed via `export PYTHONPATH="trick.zip"; strace -f -e trace=%file ./S_main... RUN.../input.py`, all individual file reads from `trick/` are replaced with the single read of `trick.zip`.
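The general Python mechanism behind this can be sketched in a few lines. This is a minimal illustration, not the prototype itself: `demo_pkg`, `demo.zip`, and `zip_package` are hypothetical stand-ins for the generated `trick/` package, `trick.zip`, and whatever build step creates it.

```python
import importlib
import sys
import tempfile
import zipfile
from pathlib import Path


def zip_package(package_dir, zip_path):
    """Write every .py file under package_dir into zip_path, keeping the
    package directory name as the top-level entry inside the archive."""
    package_dir = Path(package_dir)
    with zipfile.ZipFile(zip_path, "w") as zf:
        for py_file in sorted(package_dir.rglob("*.py")):
            arcname = py_file.relative_to(package_dir.parent).as_posix()
            zf.write(py_file, arcname)
    return Path(zip_path)


# Build a tiny stand-in package (hypothetical names, not real Trick output).
work = Path(tempfile.mkdtemp())
pkg = work / "demo_pkg"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "generated.py").write_text("VALUE = 42\n")

archive = zip_package(pkg, work / "demo.zip")

# Putting the archive itself on sys.path lets Python's zip importer
# serve the modules; the unzipped directory is not on sys.path at all.
sys.path.insert(0, str(archive))
mod = importlib.import_module("demo_pkg.generated")
print(mod.VALUE)     # 42
print(mod.__file__)  # a path that passes through demo.zip
```

Once the archive has been opened, the interpreter resolves further imports from its in-memory directory listing rather than touching each file on disk, which is the property the `strace` check above is observing.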
Complications associated with Trickified libraries -- needing to use `PYTHONPATH`
The zip by itself isn't enough: you have to adjust `sys.path` so that the `python` interpreter can import the modules under the zipped directory. Although `TRICK_PYTHON_PATH` can typically be adjusted in `S_overrides.mk` to add the zipped area, that approach does not get the entry into `sys.path` early enough for Trick to find it in the case where a Trickified library's Python files import Trick modules, as with `import sim_services`. In this scenario, the sim will die on the first input processor line that tries to reference a sim object.

@dbankieris was able to help figure out why this happens; here's his explanation, copied from our discussion:
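Independently of that explanation, the timing requirement can be illustrated with a minimal sketch. It uses a plain `python` child process as a stand-in for the sim (an assumption, since Trick embeds its interpreter), and `early.zip` and `early_mod` are hypothetical names. Because `PYTHONPATH` is read when the interpreter initializes, the archive is already on `sys.path` before the first `import` runs:

```python
import os
import subprocess
import sys
import tempfile
import zipfile
from pathlib import Path

work = Path(tempfile.mkdtemp())
archive = work / "early.zip"
with zipfile.ZipFile(archive, "w") as zf:
    # Hypothetical module standing in for a generated one.
    zf.writestr("early_mod.py", "NAME = 'from-zip'\n")

# The child never touches sys.path itself; its very first statement
# is the import, which only works if the zip is already on the path.
child_code = "import early_mod; print(early_mod.NAME)"

env = dict(os.environ, PYTHONPATH=str(archive))
result = subprocess.run(
    [sys.executable, "-c", child_code],
    env=env,
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout.strip())  # from-zip
```

Any mechanism that adds the archive to `sys.path` only after some imports have already been attempted cannot give this guarantee, which is consistent with the failure described above.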
The long term solution
In ramtares we are rolling our own solution by modifying our environment `PYTHONPATH` and `S_overrides.mk` to manage the creation of `trick.zip`. But it would be nice if this mechanism were native to Trick. Here I'd like to discuss:

Is `SIM*/trick/` --> `SIM*/trick.zip` a change that should apply globally? Should it be an option to Trick? Are there any downsides to ditching the original approach?

FYI @jmpenn @spfennell @alexlin0 @dbankieris