JP-3743: Make outlier detection respect weights for in-memory models #8777

Merged (5 commits) on Sep 12, 2024

@emolter emolter commented Sep 11, 2024

Resolves JP-3743

Closes #8776

This PR fixes a bug that causes the results of outlier detection to differ between the on-disk and in-memory cases any time there are weights that fall below the threshold value.
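
As a rough illustration of the kind of mismatch described above, the sketch below builds a median image from a small stack twice: once ignoring weights, and once masking low-weight pixels first. This is not the jwst implementation. The helper name `median_with_weight_mask`, the threshold value, and the array shapes are all assumptions made for the example.

```python
import numpy as np


def median_with_weight_mask(data, weights, threshold):
    """Median over the first axis, ignoring pixels whose weight is below threshold.

    Hypothetical helper for illustration only; not part of jwst.
    """
    # Replace low-weight pixels with NaN so nanmedian skips them.
    masked = np.where(weights < threshold, np.nan, data)
    return np.nanmedian(masked, axis=0)


# Three 1x2 "images"; the first pixel of the third image has zero weight.
data = np.array([[[1.0, 10.0]], [[2.0, 20.0]], [[900.0, 30.0]]])
weights = np.array([[[1.0, 1.0]], [[1.0, 1.0]], [[0.0, 1.0]]])

# Path that ignores weights: the low-weight pixel still contributes.
unmasked = np.median(data, axis=0)

# Path that respects weights: the low-weight pixel is excluded.
masked = median_with_weight_mask(data, weights, threshold=0.5)
```

If one code path applies the mask and the other does not, the two medians diverge only at pixels whose weights fall below the threshold, which is consistent with the symptom described in this PR.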

Tasks

  • request a review from someone specific, to avoid making the maintainers review every PR
  • add a build milestone, i.e. Build 11.3 (use the latest build if not sure)
  • Does this PR change user-facing code / API?
    • add an entry to CHANGES.rst within the relevant release section (otherwise add the no-changelog-entry-needed label to this PR)
    • update or add relevant tests
    • update relevant docstrings and / or docs/ page
    • start a regression test and include a link to the running job (click here for instructions)
      • Do truth files need to be updated ("okified")?
        • after the reviewer has approved these changes, run okify_regtests to update the truth files
  • if a JIRA ticket exists, make sure it is resolved properly

emolter commented Sep 11, 2024

@emolter emolter marked this pull request as ready for review September 11, 2024 21:51
@emolter emolter requested a review from a team as a code owner September 11, 2024 21:51
codecov bot commented Sep 11, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 60.79%. Comparing base (2ab2da9) to head (f6671fd).

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #8777      +/-   ##
==========================================
- Coverage   60.79%   60.79%   -0.01%     
==========================================
  Files         373      373              
  Lines       38696    38701       +5     
==========================================
+ Hits        23527    23529       +2     
- Misses      15169    15172       +3     


emolter commented Sep 12, 2024

The two regtest failures are caused by an unrelated PR and match the nightly runs here.

@braingram braingram left a comment

LGTM. Would you queue up another regtest to test the refactoring? Once that passes (or shows only unrelated failures) I think this is good to go.

Thanks for finding and fixing the bug!

emolter commented Sep 12, 2024

> Would you queue up another regtest to test the refactoring? Once that passes (or shows only unrelated failures) I think this is good to go.

Will do later today, when Jenkins is free. Maybe that will also give time for one of the maintainers to add any comments they have. I doubt the regtests will fail if the unit tests pass, though, since I believe they all run with the default on_disk=True and the behavior there has not changed.

melanieclarke commented Sep 12, 2024

This looks good to me - thanks especially for updating the unit test to catch this. But should we also add a regtest that runs with and without on_disk=True? If they take different code branches in other places, it might be a good idea to verify the results are always the same in both cases.

emolter commented Sep 12, 2024

> should we also add a regtest that runs with and without on_disk=True?

Good idea. The place I found for this was in the tests of the mtimage pipeline. I think this is the best choice because (1) it covers the changes to assign_mtwcs and (2) it can be parametrized easily here because the test is isolated to calwebb_image3 and doesn't run detector1 and image2 like test_nircam_image and test_miri_image do.
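
The idea of running the same test in both modes can be sketched as below. `run_stand_in_pipeline` is a hypothetical stand-in, not calwebb_image3: its "on disk" branch simply round-trips each frame through a temporary file before combining. In a pytest suite the loop would more naturally be written with `@pytest.mark.parametrize("on_disk", [True, False])`; a plain loop is used here to keep the sketch dependency-free.

```python
import json
import tempfile
from pathlib import Path
from statistics import median


def run_stand_in_pipeline(frames, on_disk):
    """Hypothetical stand-in for a pipeline run; not calwebb_image3.

    Both branches are expected to produce the same per-pixel median.
    """
    if on_disk:
        # Round-trip each frame through a temporary file before combining.
        with tempfile.TemporaryDirectory() as tmp:
            paths = []
            for i, frame in enumerate(frames):
                path = Path(tmp) / f"frame_{i}.json"
                path.write_text(json.dumps(frame))
                paths.append(path)
            frames = [json.loads(p.read_text()) for p in paths]
    # Combine: per-pixel median across frames.
    return [median(pixel) for pixel in zip(*frames)]


def check_modes_agree(frames):
    """Run both modes and confirm the results match."""
    results = {
        mode: run_stand_in_pipeline(frames, on_disk=mode)
        for mode in (True, False)
    }
    assert results[True] == results[False]
    return results[True]


frames = [[1.0, 10.0], [2.0, 20.0], [3.0, 60.0]]
combined = check_modes_agree(frames)
```

A check of this shape only guards against the two branches drifting apart; it does not say anything about whether the shared result is scientifically correct, which is what the truth files are for.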

Slightly unrelated, but I also changed create_median to read the on_disk status directly from the ModelLibrary rather than making it a parameter to the function. I view this as a more robust and less error-prone way of doing things. Let me know what you think.
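
A minimal sketch of that design choice, under stated assumptions: `StandInLibrary` and `create_median_sketch` are hypothetical names invented for this example and do not reproduce the real ModelLibrary or create_median signatures. The point is only that the function asks the container how it was opened instead of accepting a separate flag.

```python
from dataclasses import dataclass, field
from statistics import median


@dataclass
class StandInLibrary:
    """Hypothetical stand-in for a model container; not the real ModelLibrary."""

    frames: list = field(default_factory=list)
    on_disk: bool = False


def create_median_sketch(library):
    """Choose the code path from the library itself rather than a separate flag."""
    path_taken = "on_disk" if library.on_disk else "in_memory"
    # Per-pixel median across frames; identical for both paths in this sketch.
    result = [median(pixel) for pixel in zip(*library.frames)]
    return path_taken, result


library = StandInLibrary(frames=[[1.0], [3.0]], on_disk=True)
path_taken, result = create_median_sketch(library)
```

With this shape, a caller cannot pass a flag that disagrees with how the library was actually opened, which is one way to read the "less error-prone" argument above.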

emolter commented Sep 12, 2024

Another round of regression tests started here

@melanieclarke

> Good idea. The place I found for this was in the tests of the mtimage pipeline. I think this is the best choice because (1) it covers the changes to assign_mtwcs and (2) it can be parametrized easily here because the test is isolated to calwebb_image3 and doesn't run detector1 and image2 like test_nircam_image and test_miri_image do.

Great, thanks!

> Slightly unrelated, but I also changed create_median to read the on_disk status directly from the ModelLibrary rather than making it a parameter to the function. I view this as a more robust and less error-prone way of doing things. Let me know what you think.

Sounds like a good idea to me.

emolter commented Sep 12, 2024

Regtests are clean. Can this be merged now, @melanieclarke?

@melanieclarke melanieclarke left a comment

Merge away!

@emolter emolter merged commit 3190272 into spacetelescope:master Sep 12, 2024
29 checks passed
Development

Successfully merging this pull request may close these issues.

Outlier detection ignores masks when in_memory is True
3 participants