Implement a ModelLibrary subclass for jwst #8649

Closed
stscijgbot-jp opened this issue Jul 15, 2024 · 2 comments · Fixed by #8683

Comments

@stscijgbot-jp
Collaborator

Issue JP-3690 was created on JIRA by Brett Graham:

Once the ModelLibrary container class is available in stpipe, implement a subclass (and tests) for jwst. The current target is to update the steps in calwebb_image3 to use the new container class.

The main goal of the new container class is to provide a memory-efficient mode for processing large associations. This mode might not always be preferred, since small associations are likely processed more efficiently by loading the entire association into memory. The container provides an "on_disk" setting to control whether models are kept in memory or "on disk", and it may make sense to expose this setting in the pipeline (and likely in all steps that support the library).

To achieve this goal, steps must not load all models from the library at once, which might require updating some step code. Once the scope of these step updates is determined, additional tickets might be opened or the work might be included in this ticket.
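
To illustrate the access pattern a step would need, here is a minimal, self-contained sketch. It is not the stpipe or jwst API: `ToyLibrary`, `borrow`, `shelve`, `n_borrowed`, and `total_exposure_time` are hypothetical names chosen for illustration, models are plain dicts, and pickle files stand in for whatever on-disk format the real container uses. The point is only that a step handles one model at a time and returns it before taking the next.

```python
import pickle
import tempfile
from pathlib import Path


class ToyLibrary:
    """Hypothetical stand-in for a model container (NOT the stpipe API)."""

    def __init__(self, models, on_disk=False):
        # A real library would read lazily from an association; this toy
        # accepts an in-memory list only to keep the example short.
        self._on_disk = on_disk
        self._size = len(models)
        self._borrowed = set()
        if on_disk:
            self._tmpdir = tempfile.TemporaryDirectory()
            self._store = None
            for index, model in enumerate(models):
                self._write(index, model)
        else:
            self._tmpdir = None
            self._store = list(models)

    def _path(self, index):
        return Path(self._tmpdir.name) / f"model_{index}.pkl"

    def _write(self, index, model):
        with open(self._path(index), "wb") as f:
            pickle.dump(model, f)

    def __len__(self):
        return self._size

    @property
    def n_borrowed(self):
        """Number of models currently checked out."""
        return len(self._borrowed)

    def borrow(self, index):
        """Check out one model; on disk, this is the only load that happens."""
        if index in self._borrowed:
            raise RuntimeError(f"model {index} is already borrowed")
        self._borrowed.add(index)
        if self._on_disk:
            with open(self._path(index), "rb") as f:
                return pickle.load(f)
        return self._store[index]

    def shelve(self, model, index, modify=True):
        """Return a model; write it back only if the step changed it."""
        if index not in self._borrowed:
            raise RuntimeError(f"model {index} was not borrowed")
        self._borrowed.discard(index)
        if modify:
            if self._on_disk:
                self._write(index, model)
            else:
                self._store[index] = model


def total_exposure_time(library):
    """Example 'step': touches every model but holds only one at a time."""
    total = 0.0
    for index in range(len(library)):
        model = library.borrow(index)
        total += model["exptime"]
        # Read-only access, so skip the write-back.
        library.shelve(model, index, modify=False)
    return total
```

With this shape, the same step code runs unchanged whether the library was created with `on_disk=True` or `on_disk=False`, which is the property that would make exposing the setting at the pipeline level practical.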

@stscijgbot-jp
Collaborator Author

Comment by Brett Graham on JIRA:

I added 2 attachments:

  • 240814_full_run.html
  • Screenshot 2024-08-20...

These are from a run of a 972-member association (~100 GB of input data) through calwebb_image3 using #8683 (with an "on disk" library as input, at a slightly older commit, 26e5436). The pipeline succeeded, and the attachments show the memory usage recorded with memray. Peak memory usage was 50 GB, largely due to the context array generated during resample. Importantly for this PR, at no point does the pipeline load all input data into memory.

The association and data were shared with us for https://jira.stsci.edu/browse/JP-3498, and this run indicates that that ticket can also be closed when the linked PR is merged.

@stscijgbot-jp
Collaborator Author

Comment by Melanie Clarke on JIRA:

Fixed by #8683
