-
Notifications
You must be signed in to change notification settings - Fork 44
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add reaction names from KEGG #367
Conversation
I compared the
It would be good to double check these to ensure they make sense, as well as spot-check several other reactions with new name assignments to confirm they're OK. |
Interesting you found more conflicts @JonathanRob, I'm not sure how I missed them. This conflict between curated data and database-fetched data is a recurring problem. My stance here is to not keep track of manual curations done on top of database-fetched data, so I'm going to propose a different approach. Instead of overwriting reaction names, I can edit the script to only fill in reaction names where none existed, and push a new commit. With this approach we avoid the need to double-check the conflicts; only the spot-checking would need to be done. |
this seems a decent solution, would be good to upload the code as well |
caccb12
to
c5ec9fe
Compare
The force-pushed commit c5ec9fe overwrites the previous work. Now, only empty reaction names are filled in. The Python script is also attached. Given the changes, I'm re-requesting reviews. |
@mihai-sysbio thanks for providing the script, whose output However, the code still reports |
The modified reaction increment is counting the attempts where a reaction with no name (
My approach has been that of a one-time curation, so I am in favor of not over-engineering the script. If we are aiming for turning this into a scripted, recurring curation, I would prefer to have a different issue, since the parsing of the |
That would be good. Agree to not over-engineer |
c5ec9fe
to
bd111e8
Compare
@haowang-bioinfo I've updated the script that now outputs (on the second run):
The latest commit bd111e8 is force-pushed because it's now rebased on |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated the script that now outputs (on the second run)
thanks
Main improvements in this PR:
This PR resolves #181 by taking 2000 reaction names from KEGG (the first supplied). The work was done fully scripted on Google Colab and also moved to the
/code
folder.I hereby confirm that I have:
develop
as a target branch