From 05a90e9e402ec2950e2982f11b26325f3410f193 Mon Sep 17 00:00:00 2001
From: "Kris J. Becker"
Date: Thu, 4 May 2023 15:13:16 -0700
Subject: [PATCH] isisdataeval Updates, fixes and additional documentation (#5163)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* isisdataeval fixes, improvements, updates and docs

isisdata_mockup.py - Added --tojson and --hasher parameters to improve utility
make_isisdata_mockup.sh - Fixed path to test data and added new parameters to the isisdata_mockup.py introduced in this PR
README.md - improved/clarified/updated documentation

* CHANGELOG.md correction and update

The CHANGELOG.md was incorrectly merged: the entry for this app was added in the wrong section, which seems weird. This commit corrects and updates the log.

* isisdataeval documentation update via code review

* isisdataeval install $ISISROOT/bin/isisdata_mockup

isisdata_mockup.py - Install this script in $ISISROOT/bin as isisdata_mockup (no extension)
make_isisdata_mockup.sh - Renamed isisdata_mockup.py to isisdata_mockup
CMakeLists.txt - Add install command for isisdata_mockup

* isisdataeval mockup script minor docs change

* Rename isisdata_mockup.py to isisdata_mockup

To ease use and minimize confusion, renamed isisdata_mockup.py to isisdata_mockup (w/o the .py extension).
Updated documentation and install to reflect the change.

* Resolved merge conflict and add CHANGELOG.md entry

This push resolves a merge conflict in the CHANGELOG.md.
Added an entry to CHANGELOG.md to describe the PR changes.

* Removed hardcoded path in isisdata_mockup
---
 CHANGELOG.md | 5 +-
 isis/CMakeLists.txt | 3 +
 .../system/apps/isisdataeval/tools/README.md | 156 ++++++++++--------
 .../{isisdata_mockup.py => isisdata_mockup} | 156 +++++++++++++-----
 .../data/isisdata/make_isisdata_mockup.sh | 12 +-
 5 files changed, 217 insertions(+), 115 deletions(-)
 rename isis/src/system/apps/isisdataeval/tools/{isisdata_mockup.py => isisdata_mockup} (60%)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 82d6916d39..3449d98ef8 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -36,6 +36,7 @@ release.
 ## [Unreleased]
 ### Changed
+- Removed the `.py` extension from the _isisdataeval_ tool `isisdata_mockup` for consistency and installed it in $ISISROOT/bin; added the `--tojson` and `--hasher` options to the _isisdata_mockup_ tool to improve utility; updated the tool `README.md` documentation to reflect this change, removed help output and trimmed example results; fixed paths to test data in `make_isisdata_mockup.sh`. [#5163](https://github.com/DOI-USGS/ISIS3/pull/5163)
 ### Added
@@ -49,13 +50,13 @@ release.
### Changed - Updated download location for Dawn source files to include updated pck from HAMO Dawn mosaic [#4001](https://github.com/USGS-Astrogeology/ISIS3/issues/4001) -- Pinned cspice version to 67 [#5083](https://github.com/USGS-Astrogeology/ISIS3/issues/5083) +- Pinned cspice version to 67 [#5083](https://github.com/USGS-Astrogeology/ISIS3/issues/5083) - Changed the `rsync` related commands in the ISIS SPICE Web Service document to `downloadIsisData` command - Updated Geos from version 3.7 to 3.9 [#3627](https://github.com/DOI-USGS/ISIS3/issues/3627) ### Added - Instructions on setting `channel_priority=flexible` for isis environment manually during installation [#5158](https://github.com/DOI-USGS/ISIS3/issues/5158) -- Added additional filters to downloadIsisData to reduce download of extraneous kernels [#5143](https://github.com/DOI-USGS/ISIS3/issues/5143) +- Added additional filters to downloadIsisData to reduce download of extraneous kernels [#5143](https://github.com/DOI-USGS/ISIS3/issues/5143) ### Deprecated diff --git a/isis/CMakeLists.txt b/isis/CMakeLists.txt index 153478afe5..edd414fde8 100644 --- a/isis/CMakeLists.txt +++ b/isis/CMakeLists.txt @@ -600,6 +600,9 @@ install(DIRECTORY ${CMAKE_SOURCE_DIR}/scripts DESTINATION ${CMAKE_INSTA install(PROGRAMS ${CMAKE_BINARY_DIR}/lib/Camera.plugin DESTINATION ${CMAKE_INSTALL_PREFIX}/lib/) install(PROGRAMS ${CMAKE_SOURCE_DIR}/scripts/downloadIsisData DESTINATION ${CMAKE_INSTALL_PREFIX}/bin/) install(FILES ${CMAKE_SOURCE_DIR}/config/rclone.conf DESTINATION ${CMAKE_INSTALL_PREFIX}/etc/isis/) +install(FILES ${CMAKE_SOURCE_DIR}/src/system/apps/isisdataeval/tools/isisdata_mockup + PERMISSIONS OWNER_WRITE OWNER_READ OWNER_EXECUTE GROUP_READ GROUP_EXECUTE WORLD_READ WORLD_EXECUTE + DESTINATION ${CMAKE_INSTALL_PREFIX}/bin/) # Trigger all post-install behavior. 
# - The only way to run commands post-install in CMake is to add a subdirectory at diff --git a/isis/src/system/apps/isisdataeval/tools/README.md b/isis/src/system/apps/isisdataeval/tools/README.md index 3f53eed23e..11822ab150 100644 --- a/isis/src/system/apps/isisdataeval/tools/README.md +++ b/isis/src/system/apps/isisdataeval/tools/README.md @@ -1,8 +1,8 @@ # ISISDATA Mockup Procedures for Testing isisdataeval -Proper, thorough testing of ISIS application `isisdataeval` is difficult due to the requirements of having a stable and flexible ISISDATA directory structure to test with. It is unfeasible to rely on the real $ISISDATA due to its volitile and every changing content. One solution is to create a mockup of the ISISDATA directory but mimimize the resources to emulate a real, functioning ISISDATA installation. The system presented here generates a complete ISISDATA directory structure at a signification fraction of the actual dataset. By careful selective culling of many of the existing files in the selection mission datasets, it creates a real time snapshot of ISISDATA to help test `isisdataeval`. +Proper, thorough testing of ISIS application `isisdataeval` is difficult due to the requirements of having a stable and flexible ISISDATA directory structure to test with. It is unfeasible to rely on the real $ISISDATA due to its volitile and every changing content. One solution is to create a mockup of the ISISDATA directory but mimimize the resources to emulate/represent a real, functioning ISISDATA installation. The system presented here generates a complete ISISDATA directory structure at a signification fraction of the actual dataset. By careful selective culling of many of the existing files in the selection mission datasets, it creates a real time snapshot of ISISDATA to help test `isisdataeval` and validate the local ISISDATA install. -The ISISDATA mockup system is a complete copy of every directory and every file in an existing ISISDATA install. 
However, the contents of every file has been replaced with information about that file, keeping the size of every file to about 300 bytes each. Here is the format of a file prepared by this system: +The ISISDATA mockup system is a complete copy of every directory and every file in an existing ISISDATA install. However, the contents of every file have been replaced with information about that file, keeping the size of each file to about 400 bytes. Here is the JSON format of a mocked ISISDATA file prepared by this system: `cat isisdatamockup/base/kernels/spk/de118.bsp` ``` @@ -17,99 +17,117 @@ The ISISDATA mockup system is a complete copy of every directory and every file } ``` -The size of this file has been reduced to 376 bytes from 4,097,024 bytes. Each file will contain JSON text with details about the original file. The files size, creation and modified dates, and hash values for the md5, sha1 and sha256 hash algorithms computed from the contents of the file using Python tools - and independt. The hash values are expressly provided to provide comparisons of the Qt algorithms used in `isisdataeval`. Note it is possible, but cannot be guaranteed, the _source_ for the file will exists in every ISISDATA install, or even be the same file. But this is a good thing to test. +The size of this file has been reduced to 376 bytes from 4,097,024 bytes. Each file will contain JSON text with details about the original file. The file's size, creation and modified dates, and hash values for the md5, sha1 and sha256 hash algorithms are computed from the contents of the file using Python tools. This provides an external (Python) comparison for the Qt hash algorithms used in `isisdataeval` for validation tests. Note that it is possible, but cannot be guaranteed, that the _source_ for the file will exist in every ISISDATA install, or even be the same file. But this is a good thing because it **validates** the local ISISDATA installation and test environment. 
-# ISISDATA Mockup Tool - isisdata_mockup.py +# ISISDATA Mockup Tool - isisdata_mockup -A tool has been provided with the `isisdataeval` application that generates ISISDATA mockups for this and potentially other purposes. This Python tool will convert an ISISDATA installation into a mockup that is suitable to test the database lookup system and verify the contents and structure of mission kernel configurations in $ISISDATA. However, it has been generalized to use for any directory structure. Here is the application documentation: - -`isisdata_mockup.py -h` +A tool has been provided with the `isisdataeval` application that generates ISISDATA mockups for this and potentially other uses. This Python tool will convert an ISISDATA installation into a mockup that is suitable to test the database lookup system and **verify** the contents and structure of mission kernel configurations in $ISISDATA. However, it has been generalized for use with any directory structure. The application help documentation describes the parameters and behavior of the script. Use the command `isisdata_mockup -h` to produce the documentation. +In general, if you want to produce a complete inventory of the ISISDATA directory, the form to use is: ``` -usage: isisdata_mockup.py [-h] --isisdata ISISDATA --outpath OUTPATH [--ghostdir GHOSTDIR] - [--saveconfig] [--dryrun] [--verbose] - -Convert ISISDATA directory to test format - -optional arguments: - -h, --help show this help message and exit - --isisdata ISISDATA ISISDATA directory root - --outpath OUTPATH NEW ISISDATA directory to create - --ghostdir GHOSTDIR Replaces the --isisdata path with this string to ghost the actual input - directory path - --saveconfig, -s Retain *.db, *.conf files rather than replace with processs info - --dryrun, -n Only print actions but do not execute - --verbose, -v Verbose output - -isisdata_mockup.py will create a copy of all the files found in the directory specified by -the --isisdata parameter. 
All files and directories contained in --isisdata will -be copied to the --outpath destination, which specifies a new directory. This -directory will be created if it does not exist. - -All *.db and *.conf files are copied as is, fully intact. Also needed to use this -mock up of ISISDATA is the NAIF LSK. This LSK kernel is always loaded by the iTime -class to concert UTC and Et times. Otherwise, all other files encountered will -be created but the contents are replaced by information regarding the --isisdata -source file. This information includes the original file opath, creation and -modification dates in UTC format, the size of the file and its hash values. - -This app sets up a full data directory structure that mimics the contents of -ISISDATA (or any directory for that matter). The size of the file, its creation -and last modification dates and the md5, sha1 and sha256 hash values are created -in a dict which is then written to the new output file if its not a special ISIS -kernel db/conf file or LSK. - -Finally, by setting --ghostdir '$ISISDATA', this now provides a connection to -the source file in the ISISDATA directory. This can be used to compare hash values -compute by other sources. - -Example: - -To provide a full mock up of ISISDATA directory: - -isisdata_mockup.py --saveconfig --isisdata /opt/isis/data --outpath $PWD/mockisisdata --ghostdir '$ISISDATA' - -Author: Kris J. 
Becker, University of Arizona - kbecker@orex.lpl.arizona.edu - -History 2023-03-15 Kris Becker - Original verison +isisdata_mockup --saveconfig --hasher all --isisdata /opt/isis/data --outpath $PWD/mockisisdata --ghostdir '$ISISDATA' --tojson isisdata_complete_mockup.json ``` You can also choose to process only a single directory, as shown in the following example: - ``` -isisdata_mockup.py --saveconfig --isisdata /opt/isis/data/voyager1 --outpath $PWD/mockisisdata/voyager1 --ghostdir '$ISISDATA/voyager1' +isisdata_mockup --saveconfig --hasher all --isisdata /opt/isis/data/voyager1 --outpath $PWD/mockisisdata/voyager1 --ghostdir '$ISISDATA/voyager1' --tojson isisdata_voyager_mockup.json ``` -Processing times can be significant since this script is computing 3 different hash values per file. +Processing times can be significant since these examples are computing 3 different hash values per file. For this purpose, the `md5` hash algorithm is probably sufficient for ISISDATA file comparisons. # ISISDATA Mockup Test Data Preparation -With this tool and the files listed in `isisdataeval_isisdata_mockup_files.lis`, the test for isisdataeval can be recreated from any ISISDATA installation. Note that from time to time, the files used in this ISISDATA test could change which would cause failures. Here are the commands to create the `isisdataeval` test ISISDATA mockup directory - from the Git install directory: +With this tool and the files listed in `isisdataeval_isisdata_mockup_files.lis`, the test for isisdataeval can be recreated from any ISISDATA installation. Note that from time to time, the files used in this ISISDATA test could change which would cause failures. In this situation, maintainers will need to regenerate the ISISDATA mockup if the change is expected/needed. 
Here are the commands to create the `isisdataeval` test ISISDATA mockup directory originating from the Git install directory: ``` -# Create the test data area for isisdata -cd ISIS3/isis/test/isisdata +# From the Git install directory, create the test data mockup in isisdata +cd ISIS3/isis/test/data/isisdata mkdir -p mockup mockprocessing -# Produce the mockup data. Its assumed isisdata_mockup.py is in a runtime path. -isisdata_mockup.py --saveconfig --isisdata $ISISDATA/base --outpath mockprocessing/isisdatamockup/base --ghostdir '$ISISDATA/base' -isisdata_mockup.py --saveconfig --isisdata $ISISDATA/hayabusa --outpath mockprocessing/isisdatamockup/hayabusa --ghostdir '$ISISDATA/hayabusa' -isisdata_mockup.py --saveconfig --isisdata $ISISDATA/smart1 --outpath mockprocessing/isisdatamockup/smart1 --ghostdir '$ISISDATA/smart1' -isisdata_mockup.py --saveconfig --isisdata $ISISDATA/voyager1 --outpath mockprocessing/isisdatamockup/voyager1 --ghostdir '$ISISDATA/voyager1' +# Produce the mockup data. Its assumed isisdata_mockup is in a runtime path. 
+isisdata_mockup --saveconfig --hasher all --isisdata $ISISDATA/base --outpath mockprocessing/isisdatamockup/base --ghostdir '$ISISDATA/base' --tojson isisdata_mockup_base.json +isisdata_mockup --saveconfig --hasher all --isisdata $ISISDATA/hayabusa --outpath mockprocessing/isisdatamockup/hayabusa --ghostdir '$ISISDATA/hayabusa' --tojson isisdata_mockup_hayabusa.json +isisdata_mockup --saveconfig --hasher all --isisdata $ISISDATA/smart1 --outpath mockprocessing/isisdatamockup/smart1 --ghostdir '$ISISDATA/smart1' --tojson isisdata_mockup_smart1.json +isisdata_mockup --saveconfig --hasher all --isisdata $ISISDATA/voyager1 --outpath mockprocessing/isisdatamockup/voyager1 --ghostdir '$ISISDATA/voyager1' --tojson isisdata_mockup_voyager1.json # Copy/install the desired files for the test -rsync -av --files-from=isisdataeval_isisdata_mockup_files.lis mockprocessing/isisdatamockup/ mockup/ +rsync -av --files-from=isisdataeval_isisdata_mockup_files.lis mockprocessing/isisdatamockup/ mockup/ /bin/rm -rf mockprocessing -# Run an inventory test for the mockup +# Run an inventory on the local ISISDATA mockup test data isisdataeval isisdata=mockup datadir=mockup toinventory=isisdata_mockup_inventory.csv toissues=isisdata_mockup_issues.csv toerrors=isisdata_mockup_errors.csv hash=md5 +``` + +# Other products +The `isisdata_mockup` application can also produce a summary processing log file of the run used to create a mockup. It can also be run standalone on any system to provide a comparison dataset for any mockup. 
This command produces a summary of the full ISISDATA installation without producing the mockup in `--outpath` (it is now an optional parameter), computes only the md5 hash and writes the entire contents to the JSON output file specified in `--tojson`: + +``` +isisdata_mockup -v --hasher=md5 --isisdata /opt/isis3/data --ghostdir '$ISISDATA' --tojson isisdata_full_md5.json +``` +This example provides the results of a much smaller directory, $ISISDATA/base/dems, that also computes the md5 hash, including all `*.db` and `*.conf` files, does not generate, unless given, any absolute paths, and **does not** produce the mockup directory: +``` +isisdata_mockup --hasher md5 --isisdata /opt/isis3/data/base/dems --ghostdir '$ISISDATA/base/dems' --tojson isisdata_base_dem_md5.json +``` + +And here is portion of the contents of the output JSON file, `isisdata_base_dem_md5.json`: +``` +{ + "program": { + "name": "isisdata_mockup", + "version": "0.2", + "date": "2023-03-29", + "runtime": "2023-03-29T21:02:03.721127 UTC", + "endtime": "2023-03-29T21:02:39.887436 UTC", + "elapsedtime": "0:00:36.166309", + "parameters": { + "isisdata": "/opt/isis3/data/base/dems", + "outpath": null, + "ghostdir": "$ISISDATA/base/dems", + "hasher": [ + "md5" + ], + "saveconfig": false, + "tojson": "isisdata_base_dem_md5.json", + "dryrun": false, + "verbose": false + } + }, + "inventory": { + "hashlist": [ + "md5" + ], + "missing_count": 0, + "files_count": 33, + "missing": [], + "files": [ + { + "source": "$ISISDATA/base/dems/Ceres_Dawn_FC_HAMO_DTM_DLR_Global_60ppd_Oct2016_prep.cub", + "filesize": 467404363, + "createtime": "2016-10-17T21:20:18.000000", + "modifiedtime": "2020-02-13T16:33:27.681881", + "md5hash": "e8e8763630d938caf9cfc52b2820f35b" + }, + { + "source": "$ISISDATA/base/dems/LRO_LOLA_LDEM_global_128ppd_20100915.cub", + "filesize": 2141164162, + "createtime": "2011-03-19T19:18:34.000000", + "modifiedtime": "2020-02-13T16:36:48.776844", + "md5hash": "cb36cb569671f4db0badc29d737ed6d5" + }, 
+ { + ...more file descriptions follow... + } + + ] + } +} ``` +Note this file can then be shared and compared using the same runtime parameters on other platforms. I would refrain from comparing times, but all the other data (i.e., source, filesize, and hash values) can be directly compared for equivalence to help identify any differences from, say, a reputable ISISDATA source. # Running ISISDATA Mockup Tests -Once the test data is installed, the contents ./ISIS3/isis/tests/data/isisisdata/mockup will contain the test data created above. To explicitly run the `isisdataeval` tests, use `ctest -R IsisData`. If an error occurs, rerun with the command `ctest --rerun-failed --output-on-failure` to get specific information about which tests failed and why. +Once the test data is installed, the directory ./ISIS3/isis/tests/data/isisdata/mockup will contain the test data created above. To explicitly run the `isisdataeval` tests, from the build directory, use `ctest -R IsisData`. If an error occurs, rerun with the command `ctest --rerun-failed --output-on-failure` to get specific information about which tests failed and why. -Note that during this test, the contents of files with less than 102400 bytes will have all three hash values compute in the tests and compared with the values stored in hash keywords, _md5hash_, _sha1hash_ and _sha256hash_ for an external validation hashing. Note also that the all files $ISISDATA/base/kernels/lsk are retained since Isis::iTime requires an LSK and it always loads one from $ISISDATA/base/lsk/naif????.tls. +Note that during this test, files smaller than 102,400 bytes will have all three hash values computed in the tests and compared with the values stored in the hash keywords, _md5hash_, _sha1hash_ and _sha256hash_, for an external validation of hashing algorithms. 
Note also that all files in $ISISDATA/base/kernels/lsk are retained since Isis::iTime requires an LSK and it always loads one from $ISISDATA/base/kernels/lsk/naif????.tls. diff --git a/isis/src/system/apps/isisdataeval/tools/isisdata_mockup.py b/isis/src/system/apps/isisdataeval/tools/isisdata_mockup similarity index 60% rename from isis/src/system/apps/isisdataeval/tools/isisdata_mockup.py rename to isis/src/system/apps/isisdataeval/tools/isisdata_mockup index ff7018dab0..8c865226e1 100755 --- a/isis/src/system/apps/isisdataeval/tools/isisdata_mockup.py +++ b/isis/src/system/apps/isisdataeval/tools/isisdata_mockup @@ -55,6 +55,11 @@ class to concert UTC and Et times. Otherwise, all other files encountered will source file. This information includes the original file opath, creation and modification dates in UTC format, the size of the file and its hash values. +Users can select a specific set of hash algorithms using the --hasher parameter. +Note that more than one hash algorithm can be specified on the command line by +simply providing multiple --hasher values (e.g., --hasher=md5 --hasher=sha1) +or, to compute all available, use --hasher=all. + This app sets up a full data directory structure that mimics the contents of ISISDATA (or any directory for that matter). The size of the file, its creation and last modification dates and the md5, sha1 and sha256 hash values are created @@ -65,17 +70,24 @@ class to concert UTC and Et times. Otherwise, all other files encountered will the source file in the ISISDATA directory. This can be used to compare hash values compute by other sources. +If a complete summary processing log is desired, provide a file name in +--tojson. This file will contain a JSON structure with the contents of program +details and file analysis. Note that this is particularly useful if you just want the +data and not the mockup replication. Just add --dryrun to produce the log file +only. 
+ Example: To provide a full mock up of ISISDATA directory: -%(prog)s --saveconfig --isisdata /opt/isis4/data --outpath $PWD/mockisisdata --ghostdir '$ISISDATA' +%(prog)s --saveconfig --hasher all --isisdata /opt/isis4/data --outpath $PWD/isisdatamock --ghostdir '$ISISDATA' --tojson isisdata_mockup_full.json Author: Kris J. Becker, University of Arizona kbecker@orex.lpl.arizona.edu -History 2023-03-15 Kris Becker - Original verison +History 2023-03-15 Kris Becker Original verison + 2023-03-29 Kris Becker Added --tojson and --hasher parameters to improve utility ''' # Set to None for notebook mode or True for applicaton is_app = True @@ -89,29 +101,37 @@ class to concert UTC and Et times. Otherwise, all other files encountered will parser.add_argument('--isisdata', help="ISISDATA directory root", required=True, action='store', default=None) parser.add_argument('--outpath', help="NEW ISISDATA directory to create", - required=True, action='store', default=None) - parser.add_argument('--ghostdir', - help="Replaces the --isisdata path with this string to ghost the actual input directory path ($ISISDATA is highly recommended)", required=False, action='store', default=None) - parser.add_argument('--saveconfig','-s', - help='Retain *.db, *.conf files rather than replace with processs inf (for isisdataeval testing)', + parser.add_argument('--ghostdir', help="Replaces the --isisdata path with this string to ghost the actual input directory path", + required=False, action='store', default=None) + parser.add_argument('--hasher', help="Add desired/supported hash algorithms to compute", + action='append', choices=['md5', 'sha1', 'sha256', 'all'], + default=None) + parser.add_argument('--saveconfig','-s',help='Retain *.db, *.conf files rather than replace with processs info', action='store_true') + parser.add_argument('--tojson', + help="Write all ISISDATA file information to output file in JSON format", + required=False, action='store', default=None) - 
parser.add_argument('--dryrun','-n',help='Only print actions but do not execute', action='store_true') + parser.add_argument('--dryrun','-n',help='Only print --outpath actions but do not execute', action='store_true') parser.add_argument('--verbose','-v',help='Verbose output', action='store_true') args = parser.parse_args() else: - # This is ran when ( is_app is None ) - isisdata = '/opt/isis/data/base/dems' - outpath = '/tmp/MOCKISISDATA/base/dems' - ghostdir = '$ISISDATA/base/dems' + + isisdata = '/opt/isis3/data/base/dems' + hasher = None # or ['all'], ['md5'], ['md5', 'sha1', 'md5', 'sha1', 'sha1'] + outpath = None + ghostdir = '$ISISDATA/base/dems' + tojson = 'isisdata_dem_data_none.json' saveconfig = True dryrun = True verbose = True args = argparse.Namespace(isisdata=isisdata, outpath=outpath, ghostdir=ghostdir, - saveconfig=saveconfig, dryrun=dryrun, verbose=verbose) + hasher=hasher, saveconfig=saveconfig, tojson=tojson, + dryrun=dryrun, verbose=verbose) + return args @@ -140,17 +160,19 @@ def compute_hash( filepath, method='md5', dryrun=False, verbose=False, **kwargs) Returns the hex value of the compute hash for the file """ if 'sha256' in method: - hasher = hashlib.sha256() + myhasher = hashlib.sha256() elif 'sha1' in method: - hasher = hashlib.sha1() + myhasher = hashlib.sha1() + elif 'md5' in method: + myhasher = hashlib.md5() else: - hasher = hashlib.md5() + raise RuntimeError("Unsupported/invalid hash algorithm: " + method) with open( filepath, "rb" ) as fb: for data in iter( lambda: fb.read(4096), b""): - hasher.update( data ) + myhasher.update( data ) - return hasher.hexdigest() + return myhasher.hexdigest() @@ -191,7 +213,6 @@ def preserve_file_contents( filepath, **kwargs ): Returns True if the conditions to preserve the contents is met. Otherwise, returns False. 
""" - if filepath.suffix == '.db': return True if filepath.suffix == '.conf': if 'kernels' in filepath.as_posix(): return True @@ -222,24 +243,58 @@ def main(): None """ + # Program details + program_name = 'isisdata_mockup' + program_version = "0.2" + program_date = "2023-03-29" + program_runtime = datetime.datetime.now(datetime.timezone.utc) + program_runtime_str = program_runtime.strftime("%Y-%m-%dT%H:%M:%S.%f %Z") + + j_progdata = OrderedDict() + j_progdata['name'] = program_name + j_progdata['version'] = program_version + j_progdata['date'] = program_date + j_progdata['runtime'] = program_runtime_str + # Get the application parameters as provided by user args = parse_arguments() kwargs = vars(args) + j_parameters = OrderedDict( kwargs ) + # Consolidate some parameters verbose = args.verbose dryrun = args.dryrun report = verbose or dryrun isisdatadir = args.isisdata + hashers = args.hasher ghostdir = args.ghostdir outpath = args.outpath saveconfig = args.saveconfig + # Single test for generating output mockup + do_mockup = (not dryrun) and ( outpath is not None ) + if outpath is None: + outpath = '/tmp/isisdatamockingjunk' + + # Set up hashing conditions + # Selected hashers are log to --tojson + if hashers is None: + hashlist = [ ] + elif 'all' in hashers: + hashlist = [ 'md5', 'sha1', 'sha256' ] + else: + # make sure the list is unique + hashlist = list(set(hashers)) + + allfiles = sorted( pathlib.Path( isisdatadir ).glob('**/*') ) if report: print("\nTotalFilesDirsFound: ", len(allfiles) ) - missing = [ ] + missinglist = [ ] + directories = [ ] + filelist_j = [ ] for fpath in allfiles: if report: print("\n*** Processing: ", fpath.as_posix() ) @@ -253,7 +308,7 @@ def main(): if isisdatapos != 0: if report: print("FileNotInIsisDataIgnored: ", fpath.as_posix() ) - missing.append( fpath.as_posix() ) + missinglist.append( fpath.as_posix() ) else: jsondata = OrderedDict() @@ -263,7 +318,8 @@ def main(): # jsondata['source'] = newisisdata.as_posix() if 
fpath.is_dir(): - if not dryrun: newisisdata.mkdir( parents=True ) + if do_mockup: newisisdata.mkdir( parents=True ) + directories.append( isisfile ) else: # its a real file outdir = pathlib.Path( newisisdata.parent ) @@ -272,43 +328,67 @@ def main(): create_ts = os.path.getctime( fpath.as_posix() ) modified_ts = finfo.st_mtime - # This is actually needed on some systems if modified_ts < create_ts: create_ts, modified_ts = modified_ts, create_ts - # This one does not work on Linux systems (gotta be root to get the correct datetime!) + # This one does not work on Linux systems (gotta be root to get the correct value!) #createtime = datetime.datetime.fromtimestamp( finfo.st_birthtime, tz=datetime.timezone.utc ).strftime('%Y-%m-%dT%H:%M:%S.%f') createtime = datetime.datetime.fromtimestamp( create_ts, tz=datetime.timezone.utc ).strftime('%Y-%m-%dT%H:%M:%S.%f') modifiedtime = datetime.datetime.fromtimestamp( modified_ts, tz=datetime.timezone.utc ).strftime('%Y-%m-%dT%H:%M:%S.%f') - # Compute the file hashes - md5hash = compute_hash( fpath.as_posix(), method='md5', **kwargs) - sha256hash = compute_hash( fpath.as_posix(), method='sha256', **kwargs) - sha1hash = compute_hash( fpath.as_posix(), method='sha1', **kwargs) - - # The rest of the data for this file jsondata["filesize"] = finfo.st_size jsondata["createtime"] = createtime jsondata["modifiedtime"] = modifiedtime - jsondata["md5hash"] = md5hash - jsondata["sha256hash"] = sha256hash - jsondata["sha1hash"] = sha1hash + + # Hashing... 
+ for hashmethod in hashlist: + hash_name = hashmethod + 'hash' + jsondata[hash_name] = compute_hash( fpath.as_posix(), method=hashmethod, **kwargs) + + # Add to inventory + filelist_j.append( jsondata ) if report: print( json.dumps( jsondata, indent=4 ) ) if not outdir.exists(): if report: print("CreatingDir: ", outdir.as_posix() ) - if not dryrun: outdir.mkdir(parents=True) + if do_mockup: outdir.mkdir(parents=True) # There are also *.conf in $ISISDATA/mro/calibration so must exclude copying of those files - # isisdataeval requires that the LSK be valid (for time conversions) so preserve those kernels - if saveconfig and preserve_file_contents( fpath, **kwargs): + # isisdataeval requires that the LSK be valid (for time conversions) so preserve these kernels + if saveconfig and preserve_file_contents( fpath, **kwargs) : if report: print("CopyTo: ", newisisdata.as_posix() ) - if not dryrun: shutil.copy( fpath.as_posix(), newisisdata.as_posix() ) + if do_mockup: shutil.copy( fpath.as_posix(), newisisdata.as_posix() ) else: if report: print("WriteDataTo: ", newisisdata.as_posix() ) - if not dryrun: newisisdata.write_text( json.dumps( jsondata, indent=4 ) ) + if do_mockup: newisisdata.write_text( json.dumps( jsondata, indent=4 ) ) + + + # Program timing + program_endtime = datetime.datetime.now(datetime.timezone.utc) + program_elapsed_time = str(program_endtime - program_runtime) + + j_progdata['endtime'] = program_endtime.strftime("%Y-%m-%dT%H:%M:%S.%f %Z") + j_progdata['elapsedtime'] = program_elapsed_time + + # Write data if requested + if args.tojson is not None: + # Create output JSON structure + j_datalog = OrderedDict() + + j_datalog['program'] = j_progdata + j_datalog['program']['parameters'] = j_parameters + + j_datalog['inventory'] = OrderedDict( { 'hashlist' : hashlist, + 'missing_count': len(missinglist), + 'files_count' : len(filelist_j), + 'missing' : missinglist, + 'files' : filelist_j } ) + + # Write the config results + with open (args.tojson, "w") as 
json_data_file: + json.dump(j_datalog, json_data_file, indent=2) # In[ ]: diff --git a/isis/tests/data/isisdata/make_isisdata_mockup.sh b/isis/tests/data/isisdata/make_isisdata_mockup.sh index cf8d616753..5ef7e54ce9 100755 --- a/isis/tests/data/isisdata/make_isisdata_mockup.sh +++ b/isis/tests/data/isisdata/make_isisdata_mockup.sh @@ -1,14 +1,14 @@ #!/bin/sh # Create the test data area for isisdata -# cd ISIS3/isis/test/isisdata +# cd ISIS3/isis/test/data/isisdata mkdir -p mockup mockprocessing -# Produce the mockup data. Its assumed isisdata_mockup.py is in a runtime path. -isisdata_mockup.py --saveconfig --isisdata $ISISDATA/base --outpath mockprocessing/isisdatamockup/base --ghostdir '$ISISDATA/base' -isisdata_mockup.py --saveconfig --isisdata $ISISDATA/hayabusa --outpath mockprocessing/isisdatamockup/hayabusa --ghostdir '$ISISDATA/hayabusa' -isisdata_mockup.py --saveconfig --isisdata $ISISDATA/smart1 --outpath mockprocessing/isisdatamockup/smart1 --ghostdir '$ISISDATA/smart1' -isisdata_mockup.py --saveconfig --isisdata $ISISDATA/voyager1 --outpath mockprocessing/isisdatamockup/voyager1 --ghostdir '$ISISDATA/voyager1' +# Produce the mockup data. Its assumed isisdata_mockup is in a runtime path. 
+isisdata_mockup --saveconfig --hasher all --isisdata $ISISDATA/base --outpath mockprocessing/isisdatamockup/base --ghostdir '$ISISDATA/base' --tojson isisdata_mockup_base.json +isisdata_mockup --saveconfig --hasher all --isisdata $ISISDATA/hayabusa --outpath mockprocessing/isisdatamockup/hayabusa --ghostdir '$ISISDATA/hayabusa' --tojson isisdata_mockup_hayabusa.json +isisdata_mockup --saveconfig --hasher all --isisdata $ISISDATA/smart1 --outpath mockprocessing/isisdatamockup/smart1 --ghostdir '$ISISDATA/smart1' --tojson isisdata_mockup_smart1.json +isisdata_mockup --saveconfig --hasher all --isisdata $ISISDATA/voyager1 --outpath mockprocessing/isisdatamockup/voyager1 --ghostdir '$ISISDATA/voyager1' --tojson isisdata_mockup_voyager1.json # Copy/install the desired files for the test rsync -av --files-from=isisdataeval_isisdata_mockup_files.lis mockprocessing/isisdatamockup/ mockup/
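
The README text in this patch notes that a `--tojson` file can be shared and compared across platforms on source, filesize, and hash values, while leaving times out of the comparison. The following is a minimal sketch of such a comparison and is not part of the patch. It assumes the JSON layout shown in the README example (a top-level `inventory` object holding a `files` list whose records carry `source`, `filesize`, and `*hash` keys). The function names `load_inventory` and `compare_inventories` are hypothetical.

```python
import json


def load_inventory(path):
    """Read a --tojson file and index its file records by 'source'.

    Assumes the layout shown in the README example:
    {"inventory": {"files": [{"source": ..., "filesize": ..., ...}]}}
    """
    with open(path, encoding="utf-8") as fh:
        data = json.load(fh)
    return {rec["source"]: rec for rec in data["inventory"]["files"]}


def compare_inventories(reference, candidate,
                        ignore=("createtime", "modifiedtime")):
    """Compare two {source: record} maps, skipping the timestamp keys.

    Returns sorted lists of sources that are missing from the candidate,
    extra in the candidate, or present in both with differing values.
    """
    skip = set(ignore)
    missing, different = [], []
    for source, ref in reference.items():
        other = candidate.get(source)
        if other is None:
            missing.append(source)
            continue
        # Compare every key either record has, except the ignored ones
        keys = (set(ref) | set(other)) - skip
        if any(ref.get(key) != other.get(key) for key in keys):
            different.append(source)
    extra = [source for source in candidate if source not in reference]
    return {"missing": sorted(missing),
            "extra": sorted(extra),
            "different": sorted(different)}
```

Because every non-timestamp key is compared, two inventories produced with different `--hasher` selections will report each shared file as different. Running both with the same parameters, as the README suggests, avoids this.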