-
Notifications
You must be signed in to change notification settings - Fork 172
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
isisdataeval - ISISDATA Evaluation and Verification #5111
Conversation
This new application will evaluate all the files in an ISISDATA local installation for a valid kernel configuration and verify content
Testing it locally, this might be useful to run during the nightly syncs to diagnose data issues. |
My suggestion is to run this application after every ISISDATA update to your sources and make them available to users. Run the app on each $ISISDATA/{mission} directory independently and make them available as well. This provides hashes for all data and each mission after every source sync and users can now check this resource and determine any timeline of ISISDATA integrity. Some additional (perhaps Python) tools to run comparisons of the data would be super helpful. (NOTE the app contains the ability to generate a JSON file for output, too.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some small comments for now. Still messing with it.
For testing: If it is possible to test on a subset, we can test it by creating temporary kerne.db files from string literals to a temp dir similar to how we test ISDs. Then you can create empty kernel files to test things like missing, external etc. If it helps I put up boilerplate of what that would look like. But you can see some examples in tests/Fixtures.cpp
for kernel DB files using the form <b>kernels.????.db</b>. The | ||
highest version of these files are selected and its contents are read. | ||
For every <b>File</b> specification in a <b>Selection</b> group, | ||
the highested version is determined and then checked for existance. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
highested -> highest
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll fix this, thanks.
satisfy spiceinit requirements. A status of <em>external</em> | ||
indicate the file <em>as specified</em> in the db/conf file | ||
does not exist in the ISISDATA inventory. For the occurances | ||
in the above results indicate an extra '/' in the file name. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just to be clear, external means that it is a file NOT mentioned in any .db and therefore essential a dead kernel to ISIS?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. external file status as determined in isisdateaveal has several meanings that may include:
- A file that is referred to in a kernel DB/conf file, but is not found in the full ISISDATA file inventory (it can still exist but the full path to the file must contain the expanded/specified $ISISDATA path or its deemed external). These situations are common when ISISDATA == DATADIR.
- A misspelled kernel file specification that contains an extra / in its file name (as seen in Voyager SPK kernel configs in ISISDATA Evaluation and Verification #5110). The file is present and can be read in ISIS with the file specification, but it's still identified as an issue nonetheless.
- When you choose DATADIR as a mission directory in $ISISDATA, e.g., DATADIR=$ISISDATA/osirisrex, any file specified in a kernel DB or conf file that is not contained within the $ISISDATA/osirisrex directory hierarchy (i.e., inventory), is considered external. This identifies ISISDATA dependencies outside the mission directory even if the file is in $ISISDATA but not in DATADIR - a quite useful feature. And these cases may not indicate an error, but it is good to know from a management perspective (if your ISIS mission kernel spec has outside dependencies).
Note its difficult to make assumptions in isisdataeval regarding a dead file because all files in ISISDATA are considered inventory, including calibration files, which are typically not part of the kernel infrastructure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems kinda broad, does it make sense to make these different classifications? It seems useful to flag files that match case 2 and 3 specifically rather than being pooled together.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What classification do you suggest?
Case 2 is difficult to identify specifically as a misspelled file in the config file and may have different OS behaviors. It is determined by string comparisons of the environment variable expansion and the result of absolute path by OS resources via Qt::QFileInfo which seems to clean up the path.
I can add more documentation based upon this narrative if that will be helpful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think for now the extra text in the XML docs that breaks down these cases is good for now.
This commit updates documentation base upon code review. Some additional modifications were made to code to clean/clarify some processing and ensure special cases are identified. Added the VERIFY parameter as its not very straight forward if you don’t add the right parameter (need to hash files to find bad filenames). This option will not require any output files to run both validation (always ran) and verification (optional).
The isisdata_mockup.py tool is provide to create an ISISDATA mockup to test the application. This submission documents the process to set up testing in the ISIS Google Test environment.
This directory contains a minimal configuration to provide testing for isisdataeval. Note that the total size of all these files is about 120K. See the README.md in ISIS3/isis/src/system/apps/isisdataeval/tools for details on how this data is created.
There are some cube files and other files that end up being excluded in .gitignore. An entry was added that included all the files in this commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some questions inline.
{ | ||
"source": "$ISISDATA/hayabusa/kernels/spk/hay_osbj_050911_051118_v1n.bsp", | ||
"filesize": 81458176, | ||
"createtime": "2022-08-12T22:29:59.000000", | ||
"modifiedtime": "2022-11-23T23:37:37.651359", | ||
"md5hash": "bbc5587a59fd5d5d9683ac5d957649ce", | ||
"sha256hash": "8577f45d89d376a9bb1351b89e53c8d4e26327e465cc70b3bf361c975db53515", | ||
"sha1hash": "072e199a7b2697b97e79c1fa6aa1a186b5e03dc3" | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do the tests depend on having real binary kernels? Could the test setup not simply create an empty kernel? Part of the application is comparing hashes, is this hash comparison critical to the apps tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do the tests depend on having real binary kernels?
Only to compute (meaningful) hash values, which is why there is a reference to the real kernel/file.
Could the test setup not simply create an empty kernel?
It could. But that does not test the validation aspect of this application. You could (and highly recommended to) pick kernels which you know its file name, size and contents should not change or file not exist (all part of verification) after any download/install. Modify the test to only compute hashes for these kernels/files and it automatically validates your environment. I don't know how to make a better/useful test.
Part of the application is comparing hashes, is this hash comparison critical to the apps tests?
Yes. This validates $ISISDATA. By definition, this framework should also be completely consistent with conditions in any valid (i.e., fully downloaded/populated) ISIS test environment since it requires $ISISDATA to run the tests.
There are two things I cannot/do not test: file times and symbolic links since support/results have proven to be inconsistent across OSes for these file/directory/device characteristics. However, these data have proven to be useful on local systems, particularly when creating/configuring new mission kernel support systems in ISIS.
With this tool and the files listed in `isisdataeval_isisdata_mockup_files.lis`, the test for isisdataeval can be recreated from any ISISDATA installation. Note that from time to time, the files used in this ISISDATA test could change which would cause failures. Here are the commands to create the `isisdataeval` test ISISDATA mockup directory - from the Git install directory: | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Appreciate the thorough instructions. Is it possible to simple make this mocked ISISDATA area static and not rely on the real data for comparison? From the tests it seems it's comparing hashes, my first thought is that we don't need to compare hashes from the original file? probably something I am missing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure its possible to make data up, but as I mentioned above, this does not validate $ISISDATA. That would tell us nothing about ISISDATA or isisdataeval
unless some additional effort or procedures are pursued on real data. These tests use selected files from four very stable $ISISDATA directories that should not change. This approach integrates verification and validation of required data for proper and more thorough testing of the ISIS system.
Secondarily, but just as important, this provides tools to that can help address ongoing problems with ISISDATA encountered by ISIS users by first ensuring the problem is not related to their local ISISDATA installation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree on the second part, after this is merged I plan to add it to the nightly ISISDATA script and find a good way to report results in AWS such that we can get a better idea of the changes other than did download/did not download.
I think there might be a way to test the hash comparisons without real data, as the goal here is to test the app not the data itself, the data itself would be tested elsewhere. But there is already a ton of good work here and I'd rather get it in than split hairs on testing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great. Can you hold off for a bit on the merge.I have an update coming.
It might address some of the things you mentioned.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tests look good on Jenkins.
Description
Motivation for this tool is to validate the ISISDATA installation on a local processing system. Recent changes to how the USGS provides ISISDATA to ISIS users separates the kernel configuration (USGS/AWS) from the kernel download resources (typically NAIF).
This can be a useful tool to quickly determine if spiceinit failures may be due to missing files or improper local ISIS installation. It can also be used to validate a local ISISDATA install where kernel file download filters have been applied to ISISDATA installation processing. For example, AWS S3 storage does not support symbolic links to files, so a great deal of effort has been made to identify and report these types of files (they lead to copies/redundancy of the linked file). When given the $ISISDATA directory (the default), this tool will inspect all files and directories contained in the installation for a valid ISISDATA configuration. As it pertains to isisdataeval, a valid ISISDATA installation is confirmed when all the contents of the kernel database (DB) and configuration files are translated into files that exist in the system. isisdataeval does not, however, confirm all necessary or required files have been downloaded to the local system, but it can be verified by comparing results of the application on each install of ISISDATA.
I apologize I do not have tests developed for this. Honestly, I am open for suggestions on how to test this application.
Related Issue
Related Issues:
#5110
#5109
#5105
How Has This Been Validated?
Types of changes
Checklist:
Licensing
This project is mostly composed of free and unencumbered software released into the public domain, and we are unlikely to accept contributions that are not also released into the public domain. Somewhere near the top of each file should have these words: