-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce POT for energy files #396
base: staging
Are you sure you want to change the base?
Conversation
- Implement validation of intermediate checkpoints during MD simulation - Add new methods to get and validate intermediate checkpoints - Update constants with checkpoint validation parameters - Modify validator functions to support intermediate checkpoint validation - Enhance error handling and logging for checkpoint validation process
…king - Implement intermediate checkpoint submission method in FoldingMiner - Add SequentialCheckpointReporter to track checkpoints across simulation states - Update SimulationManager to calculate sequential checkpoint numbering - Enable retrieval and submission of specific checkpoint files
- Replace MDOutput Pydantic model with a standard dictionary - Update JobSubmissionSynapse to work with plain dictionary instead of Pydantic model - Modify deserialization logic to handle dictionary-based md_output - Remove unnecessary Pydantic import
@@ -78,12 +76,17 @@ def deserialize(self) -> int: | |||
self.md_output = {} | |||
else: | |||
md_output = {} | |||
for k, v in self.md_output.items(): | |||
try: | |||
md_output[k] = base64.b64decode(v) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you just have done md_output[k] = base64.b64decode(v) if v is not None else None
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe, I just copied what we had for the JobSubmissionSynapse
responses = validator.dendrite.query( | ||
synapse=synapse, axons=[axon], deserialize=True | ||
) | ||
return responses[0].cpt_files |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
based on the code this looks like responses[0].cpt_files
is a tuple... Is this true? If true, then I think it would be better to return each value individually to be clearer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a dictionary
def name(self) -> str: | ||
return "SyntheticMD" | ||
|
||
def validate_intermediate_checkpoints(self, validator, job_id, axon): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
soooo much of this class is the same at the regular pipeline we have. Can we abstract the pipeline such that we don't have to iterate within this, but we give the pipeline a checkpoint file and whatever else it needs to do the validation? Then we would iterate at the higher level and just call the pipeline? this was my original idea when creating the registry, so you can just call
for data in X:
evaluation_registry[task].validate(**data)
- Improve file attachment logic in attach_files_to_synapse - Add robust log file handling with header validation and combination - Update checkpoint saving in SimulationManager - Fix checkpoint reporter filename generation - Add more comprehensive error handling and logging
- Add traceback logging for intermediate checkpoint validation errors - Add logging for participation and step execution in validators - Update checkpoint selection logic to use the last step number - Add axons parameter to get_energies function
…/folding into features/energy-POT
Adjust the random checkpoint selection range to prevent potential index out of bounds error by subtracting 1 from the maximum checkpoint number
…e submission - Refactor log file combination using pandas for more robust processing - Add sorting of log files by step number - Simplify file attachment logic - Add logging for intermediate checkpoint submission
…/folding into features/energy-POT
- Convert validation methods in SyntheticMDEvaluator to async - Update method calls to use await for async operations - Modify checkpoint validation to check for exact number of checkpoints - Update error message for insufficient intermediate checkpoints - Adjust forward method calls to use async syntax in validators
… FoldingMiner - Extend BaseMinerNeuron with a new intermediate_submission_forward method - Update FoldingMiner to base64 encode checkpoint files for submission - Modify checkpoint file naming to use sequential numbering - Add cleanup for temporary combined log file after processing
…/folding into features/energy-POT
- Move the check for running simulations to a later point in the method - Ensure the event is updated correctly before returning simulation status - Maintain functionality for checking local storage for simulation data
…idation - Added logging for validity checks on final and intermediate checkpoints. - Updated checkpoint validation to include specific checkpoint numbers. - Refactored energy extraction logic to handle final and intermediate checkpoints separately. - Improved event dictionary structure to accommodate new energy data formats.
- Update the energy retrieval logic to use final miner energies instead of intermediate steps. - Remove redundant code related to max_step calculations for improved clarity.
- Remove the creation of the state file during initialization and instead generate it dynamically based on the checkpoint number. - Update the save and load state methods to use the newly constructed state file path. - Clean up redundant comments related to state file management.
- Update the log file naming convention to use the checkpoint number instead of the current state for improved clarity and consistency. - Remove the unused max_step variable to streamline the energy extraction logic.
…factor Refactor of the evaluator
No description provided.