Store AMI observables in OIFITS format products #5351
Comment by Rachel Cooper on JIRA: Nadia Dencheva requested the following: 1. the code from the standalone algorithm that we currently use to read/write OIFITS files, and 2. an example file that can be used for testing.
Comment by Brett Graham on JIRA: Rachel Cooper Thanks for sharing the example files. I've been taking a look at them, the OIFITS standard and doing some testing with stdatamodels and jwst to see what will be required to make an OIFITS compliant output. Is there software that can be used to verify that the output is OIFITS compliant? I've found 2 possible options, each with issues:
Comment by Rachel Cooper on JIRA: Hi Brett, Thanks for digging into this. We haven't been using any kind of verification software, unfortunately. I have used a version of the same JMMC oifits validator (the OIFITSExplorer GUI runs it when you open files, I think), but I didn't find an obvious Python package that would work out of the box, so we haven't used anything like that as part of creating the files. At some point in the past we were using pboley's oifits package in our ImPlaneIA to write the files; I'm not entirely sure why we stopped, but possibly because we wanted to include additional info that should be acceptable for the OIFITS2 standard but wasn't supported by that code.
Comment by Brett Graham on JIRA: Hi Rachel, Thanks for the follow-up and extra information. The data model schema can provide some checks for OIFITS compliance (making sure certain fits keywords and tables are defined, etc.). Developing this schema will benefit from having an independent means of verifying that the produced files are compliant (so we can find bugs in the schema). I'm not sure at this point if the datamodel schema will be sufficient for testing complete OIFITS compliance; there are expectations that the schema cannot easily enforce, such as the cross-referencing described in the OIFITS paper (one example being that each OI_VIS table must have a keyword INSNAME that corresponds to a keyword INSNAME in an OI_WAVELENGTH table). I was able to find a third option for checking a fits file for OIFITS compliance: https://github.com/jsy1001/oifitslib However, multi_jw01093012001_03102_00001_nis.oifits fails with errors related to the data tables (see errors below). Would you help me understand the data in this file? Looking at the OIFITS standard, I would expect the OI_WAVELENGTH table to have rows equal to the number of spectral channels (referred to in the document as NWAVE). This table in the multi_* file contains 1 row. This should mean that the data tables, such as OI_VIS2, should contain data columns, such as VIS2DATA, with 1 element per row. However, for this file, OI_VIS2 VIS2DATA contains 69 elements. I believe this is causing the failures in the compliance tests for the OIFITS standard. Are the data shapes for these columns an error?
Comment by Brett Graham on JIRA: I was able to construct a schema for the files like the example "jwst_jw01093012001_03102_00001_nis.fits" and opened a PR for stdatamodels adding the schema and an 'AmiOIModel' which uses the schema (and adds some additional logic to copy existing file metadata to fits keywords compatible with OIFITS). I included portions of the data from this example file in a test that generates a file that passes schema validation and produces a file compatible with 'oifits-check' from oifitslib. Please feel free to take a look at the PR. I'm happy to answer any questions and to discuss next steps. |
Comment by Rachel Cooper on JIRA: Hi Brett, The multi_* file data arrays contain interferometric observables for each integration. Generally these would be averaged together, as in the regular OIFITS files, but we have found it helpful, especially in these early days, to also provide all the non-averaged integrations for a more detailed look into the stability of the data and to allow for more careful data cleaning. I think the OIFITS standard expects that if there are multiple columns in the observable arrays, they should be for separate spectral channels, but in our case it is just a single exposure in a single filter. Maybe we could get around this by setting NWAVE = NINTS, or repeating the central wavelength where necessary, or something?
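A sketch of what repeating the central wavelength could look like (hypothetical, using a plain numpy structured array; the EFF_BAND value is invented, while the 4817 nm wavelength and 69 integrations come from the checker output for this file):

```python
import numpy as np

nints = 69           # integrations in the multi_* file (matches the 69-element data rows)
eff_wave = 4.817e-6  # central wavelength in meters (4817.0 nm per the checker output)
eff_band = 0.3e-6    # hypothetical effective bandwidth, for illustration only

# Repeat the single spectral channel NINTS times so NWAVE matches the
# per-row length of the OI_VIS/OI_VIS2/OI_T3 data columns.
wavelength = np.zeros(nints, dtype=[("EFF_WAVE", "<f4"), ("EFF_BAND", "<f4")])
wavelength["EFF_WAVE"] = eff_wave
wavelength["EFF_BAND"] = eff_band
```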
Comment by Brett Graham on JIRA: Hi Rachel, Great idea to try duplicating the central wavelength. If I load the multi_* file and:
the file is closer to conforming. However, oifits-check reports errors with the OI_T3 table containing unnormalized triple product amplitudes:

```
WARNING! Expecting format 32A but found 16A for column 'TARGET'
WARNING! Expecting format 32A but found 16A for column 'SPECTYP'
OIFITS version 2 data:
  ORIGIN  = 'STScI'
  DATE    = '2022-06-05'
  DATE-OBS= '2022-06-05T00:00:00.000'
  TELESCOP= 'JWST'
  INSTRUME= 'NIRISS'
  OBSERVER= 'UNKNOWN'
  OBJECT  = 'AB DOR'
  INSMODE = 'NRM'
  OBSTECH = ''
1 OI_ARRAY tables:
  #1 ARRNAME='g7s6' 7 elements
1 OI_WAVELENGTH tables:
  #1 INSNAME='NIRISS' 69 channels 4817.0- 4817.0nm
0 OI_CORR tables:
0 OI_INSPOL tables:
1 OI_VIS tables:
  #1 DATE-OBS=2022-06-05T00:00:00.000
     INSNAME='NIRISS' ARRNAME='g7s6' CORRNAME=''
     21 records x 69 wavebands
1 OI_VIS2 tables:
  #1 DATE-OBS=2022-06-05T00:00:00.000
     INSNAME='NIRISS' ARRNAME='g7s6' CORRNAME=''
     21 records x 69 wavebands
1 OI_T3 tables:
  #1 DATE-OBS=2022-06-05T00:00:00.000
     INSNAME='NIRISS' ARRNAME='g7s6' CORRNAME=''
     35 records x 69 wavebands
0 OI_FLUX tables:
*** Does not conform to the OIFITS standard:
OI_T3 table may contain unnormalised triple product amplitude, 1662 occurrences:-
  OI_T3 #1 record 3 channel 1
  OI_T3 #1 record 3 channel 2
  OI_T3 #1 record 3 channel 3
  OI_T3 #1 record 3 channel 4
  OI_T3 #1 record 3 channel 5
  OI_T3 #1 record 3 channel 6
  OI_T3 #1 record 3 channel 7
  OI_T3 #1 record 3 channel 8
  OI_T3 #1 record 3 channel 9
[List truncated]
```

I'm not seeing any mention of needing these to be normalized in the paper (although it appears oifitslib wants the T3_AMP values to be <= 1). Running this same file against the online checker shows no errors, but it does show numerous warnings about 0.0 values that should be NaN (or flagged out). Here is one example.
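The 0.0-values-should-be-NaN warning from the online checker could be addressed along these lines (a hypothetical sketch; the array shapes match the 21 records x 69 wavebands reported by the checker, and the column names are illustrative):

```python
import numpy as np

# Placeholder uncertainties as written for the multi-integration file.
vis2err = np.zeros((21, 69))
flag = np.zeros((21, 69), dtype=bool)

# Replace all-zero placeholder errors with NaN and flag those samples,
# which is what the online OIFITS checker suggests for missing data.
mask = vis2err == 0.0
vis2err[mask] = np.nan
flag |= mask
```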
Comment by Rachel Cooper on JIRA: Hi Brett, Good, I'm glad that seems to have worked! Theoretically T3_AMP values > 1 are nonphysical, but I don't think we want to normalize them currently. The all-zero error tables are there as a placeholder, because there have to be error estimates on the observables, but since we've been calculating errors across integrations we just populated the multi-int file error tables with zeros. I think they could probably be NaNs without breaking anything too badly, if that would satisfy the standard.

```python
output_model = datamodels.AmiLgModel(
    fit_image=nrm.modelpsf,
    resid_image=nrm.residual,
    closure_amp_table=np.asarray(nrm.redundant_cas),
    closure_phase_table=np.asarray(nrm.redundant_cps),
    fringe_amp_table=np.asarray(nrm.fringeamp),
    fringe_phase_table=np.asarray(nrm.fringephase),
    pupil_phase_table=np.asarray(nrm.fringepistons),
    solns_table=np.asarray(nrm.soln))
```
I have code where I've gathered all the updated header keys and data tables into a dictionary, but I don't quite see how the datamodel schema defines the way the data given to it is used to populate the model – i.e. in the above example, where are the keyword arguments to datamodels.AmiLgModel() defined?
I also wanted to let you know that I'll be out of communication June 30-July 14, but will return to working on this when I get back.
Thanks! |
Comment by Brett Graham on JIRA: Hi Rachel, Thanks for taking a look at the updates. It is good to know that converting the 0s to NaNs shouldn't be an issue (I don't believe this is something the schema can enforce, and the same goes for the >1 T3_AMP values). The relationship between keyword arguments to the model and the schema values is a bit convoluted. I will try to explain things (to the best of my knowledge) here, but feel free to let me know if a discussion might help to clear things up. The keyword arguments get turned into attribute assignments, so using the old AmiLgModel as an example:

```python
m = datamodels.AmiLgModel(fit_image=[[1, 2]])
# is the same as
m = datamodels.AmiLgModel()
m.fit_image = [[1, 2]]
```

The requirements (type, shape, etc.) for these arguments come from the schema. Sticking with the 'fit_image' example, here is its entry in 'amilg.schema.yaml':
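As a rough illustration of the shape such a stdatamodels schema entry takes (an illustrative guess, not the verbatim content of 'amilg.schema.yaml'):

```yaml
# Illustrative sketch of a stdatamodels schema property entry; the
# fits_hdu/datatype keys follow stdatamodels conventions, the exact
# values here are assumptions.
fit_image:
  title: Fitted image
  fits_hdu: FIT
  default: 0.0
  datatype: float32
```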
The schema also defines where these attributes will be saved in the resulting fits file (for the fit_image example it will be stored in the 'FIT' fits extension). For the new oifits-compatible files, many of these attributes will likely need to change or be mapped to new fits extensions and keywords. For the new schema I attempted to base names on the oifits standard. The test function added in the PR is likely a good starting point:

```python
m = datamodels.AmiOIModel()
m.meta.telescope = 'JWST'
m.meta.origin = 'STScI'
m.meta.instrument.name = 'NIRISS'
m.meta.program.pi_name = 'UNKNOWN'
m.meta.target.proposer_name = 'AB DOR'
m.meta.observation.date = '2022-06-05'
m.meta.oifits.array_name = 'g7s6'
m.meta.oifits.instrument_mode = 'NRM'
m.array = ...  # truncated for readability, see the example for the assigned values
m.target = ...
m.t3 = ...
m.vis = ...
m.vis2 = ...
m.wavelength = ...
```

Some of the above metadata overlaps with what is already used in the jwst pipeline ("m.meta.telescope", "origin", "instrument.name", etc.). These need to be defined for the file to be oifits compliant (note that m.meta.observation.date stores a date, not a datetime; it is unclear if this violates the oifits standard, but none of the checkers complained). Does this help to describe how to use the new model? I made quite a few guesses at names (all of this can be changed) and tried to keep the attributes to the minimum required for oifits compliance (with the expectation that more would need to be added, informed by the details of the processing step). One option for proceeding might be to merge this PR (it doesn't interfere with other steps) to be able to start using it in a jwst PR adding the new processing step, then address any changes with future PRs. How does that sound to you?
Comment by Rachel Cooper on JIRA: Hi Brett, |
Sounds great! I merged the PR. There is an open issue about updating the 'Creating a new model' docs that I'll xref here: #4436 |
Comment by Rachel Cooper on JIRA: Hi Brett,

```
cannot import name 's3_utils' from 'stdatamodels' (/Users/rcooper/miniconda3/envs/ami3test/lib/python3.9/site-packages/stdatamodels/__init__.py)
```

I'm not sure if this is due to how it's being imported somewhere in the jwst.ami code? I'm using stdatamodels version 1.7.1. I've attached the full traceback in case that's helpful.
[^stdatamodels_import_traceback.txt]
Comment by Brett Graham on JIRA: Hi Rachel, Thanks for sharing the traceback. It appears the stpipe version you have is out-of-date. Can you try installing the newest (version 0.5.0)? If that doesn't fix the issue would you share the output of |
Comment by Rachel Cooper on JIRA: Yep, that did it! Thanks :D |
Comment by Rachel Cooper on JIRA: Hi Brett, |
Comment by Brett Graham on JIRA: Hi Rachel, Hopefully I can address your question here, but feel free to let me know if you think discussing this work would be helpful. The 'm.array' in this case is a numpy structured array and should work with any of the tools for that structure. If you're constructing an array column-by-column (knowing ahead of time how many rows will be in the table) you could try initializing an empty array (note that this will have seemingly random data, but it could instead be filled with 0s, NaN, or some other value that is easier to distinguish from data):

```python
import numpy as np
from stdatamodels.jwst.datamodels import AmiOIModel

model = AmiOIModel()
n_rows = 100
model.array = np.empty(n_rows, dtype=model.array.dtype)
```

With the initialized array you can then assign values to columns using the column name:

```python
model.array['TEL_NAME'] = 'JWST'
```

If you don't know the size of the array ahead of time it's possible to append rows. However, this can result in redefinition and copying of memory (if the array grows beyond the bounds of the initially allocated memory), so it might be more efficient to pre-construct the data before assigning it to the array. Does this help to answer your question?
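The preallocate-then-fill approach and the append approach can be compared with plain numpy, independent of the datamodel (a sketch with a hypothetical cut-down dtype):

```python
import numpy as np

dtype = [("TEL_NAME", "S16"), ("STA_INDEX", "<i2"), ("DIAMETER", "<f4")]

# Approach 1: preallocate when the number of rows is known, then fill
# each column by name; scalars broadcast across all rows.
arr = np.zeros(7, dtype=dtype)
arr["TEL_NAME"] = "JWST"
arr["STA_INDEX"] = np.arange(7)
arr["DIAMETER"] = 6.5

# Approach 2: append a row after the fact; numpy must allocate a new
# array and copy the data, so prefer preallocation for large tables.
extra = np.zeros(1, dtype=dtype)
extra["TEL_NAME"] = "JWST"
grown = np.concatenate([arr, extra])
```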
Comment by Rachel Cooper on JIRA: Hi Brett,

```python
m = datamodels.AmiOIModel()
nrows = 7
m.array = np.zeros(nrows, dtype=m.array.dtype)
m.array['TEL_NAME'] = np.expand_dims(np.array(self.oifits_dct['OI_ARRAY']['TEL_NAME']), axis=0)
m.array['STA_NAME'] = np.expand_dims(np.array(self.oifits_dct['OI_ARRAY']['STA_NAME']), axis=0)
m.array['STA_INDEX'] = np.expand_dims(np.array(self.oifits_dct['OI_ARRAY']['STA_INDEX']), axis=0)
m.array['DIAMETER'] = np.expand_dims(np.array(self.oifits_dct['OI_ARRAY']['DIAMETER']), axis=0)
# m.array['STAXYZ'] = np.array(self.oifits_dct['OI_ARRAY']['STAXYZ'])
m.array['FOV'] = np.expand_dims(np.array(self.oifits_dct['OI_ARRAY']['FOV']), axis=0)
m.array['FOVTYPE'] = np.expand_dims(np.array(self.oifits_dct['OI_ARRAY']['FOVTYPE']), axis=0)
m.array['CTRS_EQT'] = np.array(self.oifits_dct['OI_ARRAY']['CTRS_EQT'])
# m.array['PISTONS'] = np.expand_dims(np.array(self.oifits_dct['OI_ARRAY']['PISTONS']), axis=0)
# m.array['PIST_ERR'] = np.expand_dims(np.array(self.oifits_dct['OI_ARRAY']['PIST_ERR']), axis=0)
print('got past here')
```

Then, when I comment out the STAXYZ column definition, it complains that the model.array['PISTONS'] and ['PIST_ERR'] keys don't exist. I think that is coming from the model schema, since those columns were added after the sample file I sent you was produced. That is part of what will differ between the raw and calibrated OIFITS files: raw OIFITS files will have PISTONS and PIST_ERR columns in the OI_ARRAY extension, while calibrated files will have 'PIST_T', 'PIST_C', and 'PIST_ERR' columns. I'm not sure if those piston columns can be defined as optional in the schema, or if it is preferable to create a separate schema for the calibrated version of these files that will be produced by the ami_normalize step?

Thanks!
Comment by Brett Graham on JIRA: Hi Rachel, I think there might be 2 (maybe 3) issues here.

```python
dtype((numpy.record, [('TEL_NAME', 'S16'), ('STA_NAME', 'S16'), ('STA_INDEX', '<i2'), ('DIAMETER', '<f4'), ('STAXYZ', '<f8', (3,)), ('FOV', '<f8'), ('FOVTYPE', 'S6'), ('CTRS_EQT', '<f8', (2,))]))
```

The easiest way that I know of to define a dtype for a structured array is to define it as a list of tuples:

```python
modified_dtype = np.dtype([('TEL_NAME', 'S16'), ('STA_NAME', 'S16'), ('STA_INDEX', '<i2'), ('DIAMETER', '<f4'), ('STAXYZ', '<f8', (3,)), ('FOV', '<f8'), ('FOVTYPE', 'S6'), ('CTRS_EQT', '<f8', (2,))])
```

and use it by modifying a few lines from your example (I won't include them all, to make it more readable):

```python
nrows = 7
m.array = np.zeros(nrows, dtype=modified_dtype)
m.array['TEL_NAME'] = np.expand_dims(np.array(self.oifits_dct['OI_ARRAY']['TEL_NAME']), axis=0)
```
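The same list-of-tuples pattern also makes it easy to append the extra raw-product columns Rachel described (a sketch; the PISTONS/PIST_ERR names come from her description of the raw files, not from the merged schema):

```python
import numpy as np

base_dtype = [
    ("TEL_NAME", "S16"), ("STA_NAME", "S16"), ("STA_INDEX", "<i2"),
    ("DIAMETER", "<f4"), ("STAXYZ", "<f8", (3,)), ("FOV", "<f8"),
    ("FOVTYPE", "S6"), ("CTRS_EQT", "<f8", (2,)),
]

# Extra columns for the raw (uncalibrated) OI_ARRAY table as described in
# the thread; the calibrated product would carry PIST_T/PIST_C instead.
raw_dtype = np.dtype(base_dtype + [("PISTONS", "<f8"), ("PIST_ERR", "<f8")])

arr = np.zeros(7, dtype=raw_dtype)
arr["PISTONS"] = np.linspace(-1e-8, 1e-8, 7)  # illustrative values only
```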
Comment by Rachel Cooper on JIRA: Hi Brett, |
Comment by Brett Graham on JIRA: Hi Rachel, Thanks for the follow up. I asked around and there is not currently a way to define optional columns. I have a PR in the works that should allow for this but this will require some more testing before it's ready for review: spacetelescope/stdatamodels#189 I am hoping to start the tests later today and if all goes well open it for review. The changes in the PR should allow you to add extra columns to the oifits data tables (array, vis, vis2, etc) which the oifits paper says is allowable by the standard. For the multi-integration version are there errors you are encountering when constructing the model? If I recall correctly, the example file you shared had some columns with shapes that violated the oifits standard (and I'm not sure if this is checked by the schema). It might make sense to run the files generated by the model against one of the oifits checkers to verify that they comply with the standard. Another example file (or more if there were modifications to the multi-integration files to try and make them compliant) would be great if it's not too much work. If the above PR doesn't make significant progress in the next few days (if that timeline works for you) we can go with a fall-back PR that includes a new schema for the calibrated files (which should have no impact on other jwst code). Please let me know what I can do to help with this work and feel free to reach out. |
Comment by Rachel Cooper on JIRA: Hi Brett, |
Comment by Brett Graham on JIRA: Good morning! As an update to this, the PR adding support to |
Comment by Rachel Cooper on JIRA: Fantastic, thanks so much for doing that so quickly! I'll update my code to use those changes. |
Comment by Rachel Cooper on JIRA: Hi Brett, |
Comment by Brett Graham on JIRA: Hi Rachel, That is correct. The
Will generate Let me know if you're running into any more issues and/or if more details would be helpful. |
Comment by Rachel Cooper on JIRA: Hi Howard Bushouse, I've been able to populate the new AMI datamodels and run the updated AmiAverage and AmiNormalize steps in my environment (thanks to Brett Graham for all the datamodel help!) I can save the outputs of the steps with the save_results=True option, but now that there are three outputs from the AmiAnalyze step they get saved with '_0_amianalyzestep.fits', '_1_amianalyzestep.fits,' etc by default, at least when I run the step independently. I was looking through the code library for what I need to update to create the appropriate suffixes for the files, but I was wondering if there is a precedent for a step having multiple file outputs that I could look at? |
Comment by Richard A. Shaw on JIRA: There seems to be a misconception running through this ticket and the lengthy discussion. The products being discussed here are FITS format, and should have the extension ".fits" as for all files in MAST that adhere to the FITS standard. The work here suggests you are following a FITS convention for optical (now, also IR) interferometry, which is fine. But the convention specifies particular content, and organization of that content within a FITS file. Such products should pass fitsverify without difficulty if the implementation is correct. |
Comment by Howard Bushouse on JIRA: Richard A. Shaw Agreed! From my bit of research into the OIFITS format it is simply a convention for how to organize interferometry data within FITS-compliant files. The only thing I wasn't completely sure of is whether the file names still use the standard ".fits" extension (as opposed to ".oifits"). Perhaps they must use the standard ".fits" file name extension in order to qualify as FITS-compliant files. |
Comment by Brett Graham on JIRA: It is also my understanding that oifits is a more strict version of fits. Every oifits file must be a valid fits file but additionally has to have certain tables, keywords, etc. I just checked with the online oifits validator and offline with the oifitslib checker on the above linked example files after moving them to a |
Comment by Rachel Cooper on JIRA: Correct, OIFITS files are just FITS files with particular structure and contents, but people familiar with them would expect them to have the .oifits extension rather than just .fits. Would that be a problem for MAST? |
Comment by Richard A. Shaw on JIRA: Yes, it would be a problem for MAST. The ".fits" extension is used pretty much everywhere in MAST to convey the file format. It may be better for the "OI" case to use a new file suffix (i.e., the part of the filename that appears between the last underscore and the ".fits" extension) to indicate a new semantic type.
Comment by Rachel Cooper on JIRA: Would that be in addition to the new suffixes I've proposed for these files? So instead of e.g. |
Comment by Richard A. Shaw on JIRA: I have not reviewed this ticket closely enough to notice what new file suffixes are being suggested. But the suffix itself must not contain an underscore. Hyphens (e.g., "_ami-oi.fits") are ok though, I believe. |
Comment by Rachel Cooper on JIRA: My comment on this ticket posted on 9/1 at 3:12 PM (just before your first comment today) gives the suffixes I suggested for the updated outputs of the AMI3 steps. I think looping in Howard Bushouse into a more active discussion now would be helpful for efficiency's sake, because it sounds like simultaneous updates to the code, documentation, and MAST are all required here. |
Comment by Brett Graham on JIRA: stdatamodels now supports oifits files. The pipeline updates to support these files are in progress.
Comment by Rachel Cooper on JIRA: Hi Brett Graham, Sorry to bring this ticket back to life again, but we're trying to make the final tweaks to get the code that produces the new OIFITS files up to snuff for MAST, which apparently means that the header keywords from the input have to be copied to the outputs. I currently get this error:

```
2023-11-02 13:58:50,581 - stpipe.AmiAnalyzeStep - WARNING - /System/Volumes/Data/user/rcooper/Projects/NIRISS/AMI/ami3_update/stdatamodels/src/stdatamodels/validate.py:38: ValidationWarning: While validating <AmiOIModel(35,) from jw01093012001_03102_00001_nis_calints.fits> the following error occurred: 'derived' is a required property
Failed validating 'required' in schema['properties']['meta']['properties']['oifits']:
OrderedDict([('type', 'object'), ('required', ['array_name', 'instrument_mode', 'derived']), ('properties', OrderedDict([('array_name', OrderedDict([('title', 'Array name'), ('type', 'string'), ('fits_keyword', 'ARRNAME'), ('fits_hdu', 'OI_ARRAY')])), ('instrument_mode', OrderedDict([('title', 'Instrument mode'), ('type', 'string'), ('fits_keyword', 'INSMODE')])), ('derived', OrderedDict([('type', 'object'), ('required', ['array', 'content', 'object', 'observer', 't3', 'target', 'vis', 'vis2', 'wavelength']), ('properties', OrderedDict([('array', OrderedDict([('type', 'object'), ('required', ...
warnings.warn(errmsg, ValidationWarning)
```

The line I used to do this was:

```python
oifitsmodel.update(input_model, only="PRIMARY")
```
Comment by Brett Graham on JIRA: Hi Rachel, Happy to help! I wasn't aware of |
Comment by Rachel Cooper on JIRA: Hi Brett, thanks so much for your quick action on this. I just tried the changes and that seems to have solved it! Should I leave an approving review on the PR? |
Comment by Brett Graham on JIRA: Thanks for testing. An approving review would be great. I will open it for review and queue up a regression test run. |
Comment by Rachel Cooper on JIRA: I think I'm not currently able to review the PR since I'm not one of the spacetelescope/stdatamodels-maintainers? I may be missing something though. |
Comment by Brett Graham on JIRA: Thanks for letting me know. Feel free to leave a comment on the PR if it looks good and I'll ping some other maintainers for approval (to allow the branch protection to pass). |
Comment by Brett Graham on JIRA: Howard Bushouse I am closing this ticket per the instructions https://github.com/spacetelescope/jwst/wiki/How-to-resolve-JIRA-issues under the assumption that you created this and it's finished "testing". Let me know if I made any mistakes here in closing it. |
Comment by Rachel Cooper on JIRA: Testing done; AMI observables are now saved in "_ami-oi.fits" (also "_amimulti-oi.fits", "aminorm-oi.fits") products which adhere to the OIFITS standard. Commissioning data (PID 1093) used for testing. |
Issue JP-1713 was created on JIRA by Howard Bushouse:
In addition to the overall algorithm upgrades to AMI processing requested by the AMI WG in JP-1449, they have also requested that the results ("observables") of each step be stored in products that adhere to the OIFITS convention (see https://fits.gsfc.nasa.gov/registry/oifits.html).