Use integer linear programming in balance_stoichiometry #86
cc @saraaamin would you mind trying this branch? (if that sounds difficult, don't worry -- I can make a release which you can "pip install") |
will this require that i install glpk? |
cvxopt was using glpk, didn't look "under the hood" in pulp, so not sure if glpk is needed for pulp (or if it's bundled). I'm quite sure the code could be made to work with plain SymPy if that's needed -- it's "just" a matter of coding up a general strategy. Does "pip install pulp" work? |
yes pip install pulp worked, and i downloaded this branch and installed it. |
hmmm.. no that sounds worrisome. It should raise an exception. Could you write a minimal self-contained example returning None? |
I take it back, sorry about that i wasn't returning the right thing.
i'll extensively test with a bunch of equations and get back to you. |
I was thinking that we could merge this PR now (since it is fixing a previously failing test). |
I ran it through a reaction set, but it's only returning coefficients if the reaction is already balanced. I'm trying to find one of the reactions that is under-determined but can find one of the solutions for it and see if the fix will work for that. or is it going to throw an error if it's under-determined no matter what? |
Not quite sure I follow. Could you write a minimal example together with what you'd expect? |
@bjodah Yes you can merge it. it's working now. example:
While the solution identified is feasible, all the coeff could have been '1' and that would have minimized an Obj of summing the identified coeff values. |
found another case where the result is unbalanced and not sure why it's giving back the wrong answer:
|
Thanks, I pushed a fix for your second example. The one before though seems to be working for me:
>>> balance_stoichiometry({'C21H28N7O14P2', 'C3H4O3', 'C5H9NO4'}, {'C5H6O5', 'C21H29N7O14P2', 'C3H5NO2', 'H'}, underdetermined=None)
(OrderedDict([('C21H28N7O14P2', 1), ('C3H4O3', 1), ('C5H9NO4', 1)]),
 OrderedDict([('C21H29N7O14P2', 1), ('C3H5NO2', 1), ('C5H6O5', 1), ('H', 1)])) |
should i download the same branch, and run the examples again? |
That would be great! |
I hate to say this but i found more examples that should not be balanced and wrong coeff are identified for them:
in this case the coeff are correct for all atoms, only 'C' is unbalanced. and according to a colleague in biochem, this is a wrong reaction and can't be balanced. another example:
|
Ah, yes. Thanks, didn't think about that. Since the linear programming solver performs a minimization there is no guarantee that the constraints are actually fulfilled. I added explicit checks which will raise a |
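The kind of explicit check described in the comment above can be sketched in plain Python. This is only an illustration, not ChemPy's implementation: the helper names `count_atoms` and `is_balanced` are hypothetical, and the formula parser assumes flat formulas without parentheses, charges, or hydrates. The idea is to recount every element on both sides after the solver returns, instead of trusting the solver's result.

```python
import re
from collections import Counter


def count_atoms(formula):
    """Count atoms in a flat formula such as 'C3H4O3' (hypothetical helper).

    Assumes no parentheses, charges or hydrates.
    """
    counts = Counter()
    for element, number in re.findall(r'([A-Z][a-z]?)(\d*)', formula):
        counts[element] += int(number) if number else 1
    return counts


def is_balanced(reactants, products):
    """Return True if every element occurs equally often on both sides.

    `reactants` and `products` map formula -> integer coefficient.
    """
    def total(side):
        acc = Counter()
        for formula, coeff in side.items():
            for element, n in count_atoms(formula).items():
                acc[element] += coeff * n
        return acc

    return total(reactants) == total(products)


# The reaction quoted earlier in this thread, with all coefficients equal to 1.
reac = {'C21H28N7O14P2': 1, 'C3H4O3': 1, 'C5H9NO4': 1}
prod = {'C21H29N7O14P2': 1, 'C3H5NO2': 1, 'C5H6O5': 1, 'H': 1}
print(is_balanced(reac, prod))
```

A caller could raise an exception whenever `is_balanced` returns `False`; which exception to raise is left open here.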
I will test it later today, but i'm sitting right now with my colleague, and we noticed that the below example should not balance as well. would the new fix be able to detect that?
|
Currently for n=1:
for n == 2:
(since the number of carbons is affected) |
what if i don't want to replace the n with a specific number, I just want to leave it as is? would it detect that it is unbalanced? |
Currently the parser does not translate "n" into e.g. SymPy symbols. It could be done but would require some work on the parser and |
Here's a work-around:
>>> balance_stoichiometry({'H2O', 'C21H29N7O17P3', 'C3H4O3'}, {'H', 'C2H2O4', 'CH2', 'C21H30N7O17P3'}, underdetermined=None)
(OrderedDict([('C21H29N7O17P3', 1), ('C3H4O3', 1), ('H2O', 1)]),
 OrderedDict([('C21H30N7O17P3', 1), ('C2H2O4', 1), ('CH2', 1), ('H', 1)])) |
ok, let's keep it as is right now, and i'll try to deal with n's on my side. |
I finished testing it, and all is good. |
Thank you for the feedback. |
Thanks for helping me out. |
Thanks for considering it. I have submitted a manuscript to the Journal of Open Source Software: Hopefully it will get accepted, and when (if) it is, I will update the README with a "how to cite" section. If you need a citation key before that, the second best solution is citing the Zenodo DOI: |
Great, good luck! |
@saraaamin the manuscript got accepted and there is now a DOI in the README if you need to cite ChemPy. |
To address gh-85.
This should in theory give canonical solutions (unless there are multiple degenerate solutions) to under-determined systems representing the stoichiometry of chemical reactions.
Currently the problems with glpk.ilp are:
It is possible that a hand-written special-purpose algorithm is still preferable to a general ILP solver.
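To make the optimisation problem concrete, here is a minimal sketch of the formulation: find positive integer coefficients that satisfy the element balance and have the smallest possible sum. It is a brute-force search, not an ILP solver and not the code in this PR. The names `count_atoms`, `balance_min_sum` and the bound `max_coeff` are assumptions made for illustration, and only flat formulas without parentheses or charges are handled.

```python
import re
from collections import Counter
from itertools import product


def count_atoms(formula):
    """Count atoms in a flat formula such as 'H2O' (hypothetical helper)."""
    counts = Counter()
    for element, number in re.findall(r'([A-Z][a-z]?)(\d*)', formula):
        counts[element] += int(number) if number else 1
    return counts


def balance_min_sum(reactants, products, max_coeff=6):
    """Brute-force stand-in for the ILP: minimise the sum of coefficients.

    Every coefficient is an integer in 1..max_coeff. Raises ValueError
    if no combination in that range balances all elements.
    """
    species = sorted(reactants) + sorted(products)
    signs = [1] * len(reactants) + [-1] * len(products)
    comps = [count_atoms(s) for s in species]
    elements = sorted(set().union(*comps))
    best = None
    for coeffs in product(range(1, max_coeff + 1), repeat=len(species)):
        # Skip candidates that cannot improve on the best sum found so far.
        if best is not None and sum(coeffs) >= sum(best):
            continue
        # Equality constraints: net count of every element must be zero.
        if all(sum(sg * c * comp[el]
                   for sg, c, comp in zip(signs, coeffs, comps)) == 0
               for el in elements):
            best = coeffs
    if best is None:
        raise ValueError("no balanced solution with coefficients <= %d" % max_coeff)
    n = len(reactants)
    return dict(zip(species[:n], best[:n])), dict(zip(species[n:], best[n:]))


print(balance_min_sum({'H2', 'O2'}, {'H2O'}))
# ({'H2': 2, 'O2': 1}, {'H2O': 2})
```

The search space grows exponentially with the number of species, so this only suits small examples. When several solutions share the same minimal sum (the degenerate case mentioned above), the sketch returns whichever it meets first in enumeration order.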