-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove metabolism constraints file #111
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this a lot.
See the comments on compile+eval.
models/ecoli/processes/metabolism.py
Outdated
'enzymes': enzymes, | ||
'kineticsSubstrates': substrates, | ||
} | ||
exec self.compiledConstraints in local |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The exec statement says providing one dictionary to exec
uses it for both the globals and the locals. If locals are in fact faster than globals, then it's worth providing {}, local
.
Might as well use the Python 3 compatible syntax exec(self.compiledConstraints, global, local)
.
Actually, this module has already imported numpy so there's no need to compile code that redoes that. Just pass np
in one of the dictionaries. It might be worth testing whether the locals or globals dictionary is faster for that -- np
is usually a global.
And with that you can just compile and eval() an expression, which is simpler and I'd expect it to be more efficient since eval()
never has to modify its locals dictionary. (Note that the expression source code must end with a \n
.)
if not self.compiledConstraints:
self.compiledConstraints = compile('np.array(%s).reshape(-1)\n' % self.kineticConstraintsStr, '<string>', 'eval')
local = {
'enzymes': enzymes,
'kineticsSubstrates': substrates,
'np': np,
}
return eval(expr, {}, local)
Was the 8x performance penalty with eval()
due to evaluating a string multiple times instead of compiling it once?
Does using eval()
instead of exec()
let you move the compile & eval code into sim_data
?
"reconstruction", "ecoli", "dataclasses", "process", "metabolism_constraints.py" | ||
) | ||
writeMetabolicConstraintsFile(constraintsFile, constraints) | ||
self.kineticConstraints = str(sp.Matrix(constraints))[7:-1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why 7 and -1? Should it use len(something)
or symbolic constants?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was just carry over from the kinetic_constraints.py file. Creating the sp.Matrix object adds Matrix() around the string so the indexing just strips that. It looks like all we need to do is turn the constraints into a string and then we don't need to index or reshape the numpy array on the function call so I can update that
Even better:
This proves to be faster the first time and subsequent times.
(Another variation passes in About pickling Test code
|
BTW, the penalty for v1 mostly goes away by changing That's really surprising since the first-run step |
Thanks for doing the testing Jerry! It seems like I can create the same |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is definitely an improvement, although I don't think I've ever used exec
or compile
so it's a bit mysterious to me.
If you want to evaluate the kinetic constraints in a more algorithmic fashion (e.g. something that could be assembled and then evaluated in Cython) let me know. My parameter estimation work provides guidelines for disassembling a kinetic rate law into standard component parts. This would also put you one (big) step closer towards being able to use my parameter estimation techniques. 😉
models/ecoli/processes/metabolism.py
Outdated
|
||
def getKineticConstraints(self, enzymes, substrates): | ||
''' | ||
Allows for dynamic programming for kinetic constraint calculation from sim_data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't call this "dynamic programming" - dynamic programming is a specific algorithmic technique.
models/ecoli/processes/metabolism.py
Outdated
Returns np.array of the kinetic constraint target for each reaction with kinetic parameters | ||
Inputs: | ||
enzymes (np.array) - concentrations of enzymes associated with kinetics constraints | ||
substrates (np.array) - concentrations of substrates associated with kinetics constraints |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Documenting inputs is great, something I'm also trying to do more. With arrays I like to annotate the dtype and shape if possible.
This looks good! Revised notes on pickling:
|
models/ecoli/processes/metabolism.py
Outdated
@@ -111,8 +111,7 @@ def initialize(self, sim, sim_data): | |||
self.catalyzedReactionBoundsPrev = np.inf * np.ones(len(self.reactionsWithCatalystsList)) | |||
|
|||
# Data structures to compute reaction targets based on kinetic parameters |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe reword "Data structures" as "Function" or something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. I feel like it's easy to forget about comments when updating code. Probably something we need to be more careful about moving forward so we don't get mismatches between what the comments say and what the code actually does
I've been playing around with kinetic parameters and because we currently use a dynamically created file, I needed to rerun the fitter to generate the file when I wanted to switch back and forth between sims. This was my attempt at keeping the dynamically generated code more self contained especially since we aren't tracking the files that are dynamically generated. I had wanted to keep the
compile
andexec
commands contained within sim_data but that prevented the pickling of the object. I could have done it with theeval
command but that is about 8x slower than compiling once and runningexec
. I'm not sure how I feel about relying onexec
and feel like there should be a cleaner way to do this. If this implementation looks good though, we could probably do roughly the same thing with the ode constraints to remove those dynamically generated files as well.I found this site useful in setting this up: http://lucumr.pocoo.org/2011/2/1/exec-in-python/