MPI exceptions logging and more robust NetCDF closures #1084
This should be ready to be merged!
I've just made two comments:
- You can remove the masking operation at the #TODO
- The changes in sams.py lead to clearer code. Thanks!
# Open analysis file.
self._storage_analysis = self._open_dataset_robustly(self._storage_analysis_file_path,
mode, version=netcdf_format)
# TODO - AR: What's the purpose of set_auto_mask(False)? When we create the
We no longer need it since we're not using masking!
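For readers following the `_open_dataset_robustly` call above, here is a minimal sketch of what a "robust open" could look like. It is not the code in this PR: the helper name `open_robustly`, its parameters, and the choice to retry only on `OSError` are all assumptions made for illustration.

```python
import time


def open_robustly(opener, *args, n_attempts=5, sleep_seconds=0.5, **kwargs):
    """Call opener(*args, **kwargs), retrying when it raises OSError.

    Hypothetical helper: `opener` is any callable that returns an open
    handle (for example a dataset constructor). The last failure is
    re-raised once all attempts are used up.
    """
    if n_attempts < 1:
        raise ValueError("n_attempts must be at least 1")
    for attempt in range(1, n_attempts + 1):
        try:
            return opener(*args, **kwargs)
        except OSError:
            # Assumption: a transient failure (e.g. a file still held by
            # another process) surfaces as OSError and is worth retrying.
            if attempt == n_attempts:
                raise
            time.sleep(sleep_seconds)
```

With a NetCDF library, the dataset constructor would be passed as `opener` along with the file path and mode. Retrying only on `OSError` keeps programming errors such as a bad argument from being silently repeated.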
@@ -373,13 +373,14 @@ def _mix_replicas(self):
# TODO: We may be able to refactor this to simply have different update schemes compute neighborhoods differently.
# TODO: Can we allow "plugin" addition of new update schemes that can be registered externally?
with mmtools.utils.time_it('Mixing of replicas'):
jump_and_mix = self._JumpAndMixPacket(self.n_replicas, self.n_states)
# Initialize statistics. This matrix is modified by the jump function and used when updating the logZ estimates.
replicas_log_P_k = np.zeros([self.n_replicas, self.n_states], np.float64)
This is much clearer now. Thanks!
Yank/multistate/sams.py
for state_index in neighborhood:
u_k[state_index] = self._energy_thermodynamic_states[replica_index, state_index]
log_P_k[state_index] = self.log_weights[state_index] - u_k[state_index]
u_k = self._energy_thermodynamic_states[replica_index, state_index]
To match the equations, it would be clearer to write:
u_k = self._energy_thermodynamic_states[replica_index, :]
log_P_k[state_index] = - u_k[state_index] + self.log_weights[state_index]
Good idea! Will change.
Great!
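To make the suggested form in this thread concrete, here is a small standalone sketch of the per-replica computation, written with plain lists so it runs without NumPy. The function name is hypothetical, and the normalization step (a log-sum-exp over the neighborhood) is an assumption about how the values are used afterwards, not something shown in the diff.

```python
import math


def neighborhood_log_probabilities(u_k, log_weights, neighborhood):
    """Return log P_k for one replica, restricted to `neighborhood`.

    u_k[k] is the reduced potential of the replica in state k and
    log_weights[k] the current log weight of state k. States outside
    the neighborhood get log-probability -inf.
    """
    # Unnormalized term, written in the order suggested in the review:
    # log P_k = -u_k + log_weights[k]
    unnormalized = {k: -u_k[k] + log_weights[k] for k in neighborhood}

    # Assumption: normalize over the neighborhood with a numerically
    # stable log-sum-exp (subtract the maximum before exponentiating).
    max_value = max(unnormalized.values())
    log_norm = max_value + math.log(
        sum(math.exp(value - max_value) for value in unnormalized.values())
    )

    log_P_k = [-math.inf] * len(u_k)
    for k, value in unnormalized.items():
        log_P_k[k] = value - log_norm
    return log_P_k
```

Taking the whole row of energies once and indexing it inside the loop, as suggested, keeps the code line-for-line comparable with the equation.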
I've made a few small modifications to fix the stack trace that is logged when an exception is raised in an MPI process and to make the NetCDF handling more robust. In particular:
- Changes to AlchemicalPhase.__del__, and yank.experiments.Experiment now forces garbage collection after deleting the alchemical phase.
- More robust Dataset.open/close() in the multi-state samplers.

The switch_phase_interval is working much better for me after these modifications, which is encouraging.
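As an illustration of the first point in the description (logging a usable stack trace when a worker process raises), here is a minimal sketch using only the standard library. It is not the code from this PR: the decorator name `log_exceptions` and the explicit `rank` argument are hypothetical, and in a real MPI run the rank would come from the communicator.

```python
import functools
import logging
import traceback

logger = logging.getLogger(__name__)


def log_exceptions(rank):
    """Decorator that logs the full traceback of any exception, then re-raises.

    Hypothetical helper: `rank` identifies the process in the log message.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception:
                # format_exc() captures the traceback of the exception
                # currently being handled, so the log shows where it was
                # raised rather than where it was caught.
                logger.critical("Exception on rank %d:\n%s",
                                rank, traceback.format_exc())
                raise
        return wrapper
    return decorator
```

Re-raising after logging leaves the decision to abort or recover with the caller, while the log still records the original traceback.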