Is there a way to check if inputs exist before running METplus? #2167
-
Hello, given a METplus config that specifies one or more FCST and OBS input templates and uses pattern matching for init/valid/lead times, is there a way to check that all the input files exist prior to running? This might be the equivalent of running a tool in "dummy mode". The basic use case I'm trying to solve is this:
So far, the best solution I've concocted is to turn off all the outputs and run the job. If the inputs are missing it fails pretty quickly, and the log file lists everything it couldn't find. However, this can still take a while to finish executing if partial data is available, and there is the added messiness of handling which parts of the job need to be re-run. I realise Python embedding would be one way to solve this. However, I'm hoping to avoid the dev work required for that solution (for now anyway). Also, from a design point of view, I'd like to avoid METplus runs kicking off calls to external services and pulling data down, when this can be managed more explicitly and transparently by other applications. Any suggestions appreciated. Thanks
Replies: 2 comments 2 replies
-
Hi @John-Sharples,
The short answer is no, the METplus wrappers do not have any built-in mechanism to check that all of the expected inputs exist before running.
There is a METplus config variable called DO_NOT_RUN_EXE that will try to build commands but not actually run them. You could use this to speed up the run, since the actual MET tools are not being called. The METplus run should still fail if any of the expected files are not found. However, if your use case runs more than one process and a downstream process reads output that is created by an earlier process, then the use case will likely always fail.
I'm not sure if there is an easy catch-all solution that could be implemented in the METplus wrappers to check for the existence of input files. The logic to determine whether a given *_INPUT_DIR/TEMPLATE is actually an input file or an intermediate file would be complex, and it would be difficult to catch every case.
I was going to suggest adding a UserScript instance to the beginning of the PROCESS_LIST that calls a script to check for the existence of input files and obtains them if they are not found locally. It sounds like you are trying to avoid doing something like that when you said:
Could you explain more what you mean by that? Maybe I am not fully understanding your question/issue.
-
Thanks for your comments @georgemccabe, much appreciated. DO_NOT_RUN_EXE looks like a handy flag for this kind of task. Thanks also for the suggestion about a UserScript; I confess I didn't know about this functionality.
Sure thing! The design principle of Separation of Concerns says we should have separate applications, each with a dedicated purpose, and avoid overlapping functionality. In this case, that means one application that does verification (METplus) and some other process that manages data transfer/storage (and others for databases, job queues/scheduling, etc.). Hence my attempt to separate the data management from the execution of METplus. I appreciate that METplus wasn't designed with this sort of functionality in mind, and once you introduce multiple MET processes, it becomes very convoluted. That said, UserScripts might prove to be a pretty good middle ground.
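For anyone landing on this thread later, here is a minimal sketch of the kind of standalone existence check discussed above, something that could be run on its own or called from a UserScript step. It is not part of METplus. The function names `expected_files` and `find_missing` are made up for illustration, and the filename template uses Python's own `str.format` fields (`{init:%Y%m%d%H}`, `{valid:...}`, `{lead:03d}`) rather than the METplus template syntax, so templates from a METplus config would need translating by hand.

```python
from datetime import timedelta
from pathlib import Path


def expected_files(template, init_times, lead_hours):
    """Expand a filename template for every init time / lead hour pair.

    ASSUMPTION: `template` is a Python str.format string with the
    hypothetical fields {init}, {valid} (datetimes) and {lead} (int hours).
    This is NOT the METplus filename template syntax.
    """
    paths = []
    for init in init_times:
        for lead in lead_hours:
            valid = init + timedelta(hours=lead)
            # datetime objects accept strftime codes as a format spec
            name = template.format(init=init, valid=valid, lead=lead)
            paths.append(Path(name))
    return paths


def find_missing(paths):
    """Return the subset of paths that are not existing regular files."""
    return [p for p in paths if not p.is_file()]
```

A caller would build the list of init times and lead hours to match its own config, print whatever `find_missing` returns, and exit non-zero if the list is not empty, so that a scheduler or wrapper script can decide whether to fetch data or skip the METplus run. This only covers true inputs; intermediate files written by an earlier process in the same run would have to be left out of the templates being checked.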