Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

replace run config .py files with .yaml #483

Merged
merged 15 commits into from
Apr 23, 2021
Merged

replace run config .py files with .yaml #483

merged 15 commits into from
Apr 23, 2021

Conversation

marxide
Copy link
Contributor

@marxide marxide commented Apr 1, 2021

Changes the pipeline run configuration file format to YAML.
Implements a strict YAML validator.
Closes #479.

To do:

  • Change configuration to YAML
  • Update tests
  • Rebase on docs-update-adam
  • Update docs
  • Update CHANGELOG.md

Changes the pipeline run configuration file format to YAML.
Implements a strict YAML validator.
Closes #479.
marxide added 5 commits April 1, 2021 14:03
…ig-yaml

* commit '99da822d6082fe0d7700ad2cfbc7e326ed90a485': (21 commits)
  Added docstrings to translators and converters
  Updated adminusage/cli.md
  Manually commit monitor.md suggestion
  Apply suggestions from code review
  Apply suggestions from code review
  Updated urls and admin docstrings.
  Fix quickstart usage links
  Moved Mathjax to `docs/theme/js/extra.js`
  Apply suggestions from code review
  Added a code reference section to the documentation
  Fix internal page links
  Update CHANGELOG.md
  Finished website section
  Added new navbar options: docs and start a discussion
  Completed output section
  Added pipeline outputs page.
  Added source stats page.
  Added new source and monitoring pages
  Added near complete association page
  Added ingest page.
  ...
* origin/master:
  Bump Browsersync from 2.26.13 to 2.26.14.
  Update CHANGELOG.md
  Bump prismjs from 1.22.0 to 1.23.0
@marxide marxide marked this pull request as ready for review April 2, 2021 22:38
@marxide marxide requested a review from ajstewart April 2, 2021 22:38
@ajstewart
Copy link
Contributor

@marxide I think the validation errors may need some clarification. Say if an image was missing, previously the error would give you a good clue as to what was wrong, for example, if there is a missing background image, see the error messages below.

With YAML the error seems a little cryptic for someone not familiar with the code, we can tell what's wrong but I don't think the average user will.

old

Pipeline error: Number of BACKGROUND_FILES different from number of IMAGE_FILES files.
Traceback (most recent call last):
  File "/Users/adam/GitHub/vast-pipeline/vast_pipeline/management/commands/runpipeline.py", line 103, in run_pipe
    pipeline.validate_cfg(user=user)
  File "/Users/adam/GitHub/vast-pipeline/vast_pipeline/pipeline/main.py", line 349, in validate_cfg
    'Number of BACKGROUND_FILES different from number of'
vast_pipeline.pipeline.errors.PipelineConfigError: Pipeline error: Number of BACKGROUND_FILES different from number of IMAGE_FILES files.
CommandError: Config error:
Pipeline error: Number of BACKGROUND_FILES different from number of IMAGE_FILES files.

YAML:

Pipeline error: while parsing a mapping
  in "run config", line 93, column 1:
      background:
    ^ (line: 93)
required key(s) '14' not found
  in "run config", line 119, column 1:
        - /Users/adam/askap/pipeline ...
    ^ (line: 119).
Traceback (most recent call last):
  File "/Users/adam/GitHub/vast-pipeline/vast_pipeline/pipeline/config.py", line 262, in validate
    for epoch in epochs
  File "/Users/adam/anaconda3/envs/vast-pipeline-dev/lib/python3.7/site-packages/strictyaml/representation.py", line 110, in revalidate
    result = schema(self._chunk)
  File "/Users/adam/anaconda3/envs/vast-pipeline-dev/lib/python3.7/site-packages/strictyaml/validators.py", line 17, in __call__
    self.validate(chunk)
  File "/Users/adam/anaconda3/envs/vast-pipeline-dev/lib/python3.7/site-packages/strictyaml/compound.py", line 203, in validate
    sorted(list(set(self._required_keys).difference(found_keys)))
  File "/Users/adam/anaconda3/envs/vast-pipeline-dev/lib/python3.7/site-packages/strictyaml/yamllocation.py", line 51, in while_parsing_found
    self.expecting_but_found("while parsing {0}".format(what), found=found)
  File "/Users/adam/anaconda3/envs/vast-pipeline-dev/lib/python3.7/site-packages/strictyaml/yamllocation.py", line 47, in expecting_but_found
    self,
strictyaml.exceptions.YAMLValidationError: while parsing a mapping
  in "run config", line 93, column 1:
      background:
    ^ (line: 93)
required key(s) '14' not found
  in "run config", line 119, column 1:
        - /Users/adam/askap/pipeline ...
    ^ (line: 119)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/adam/GitHub/vast-pipeline/vast_pipeline/management/commands/runpipeline.py", line 105, in run_pipe
    pipeline.config.validate(user=user)
  File "/Users/adam/GitHub/vast-pipeline/vast_pipeline/pipeline/config.py", line 267, in validate
    raise PipelineConfigError(e)
vast_pipeline.pipeline.errors.PipelineConfigError: Pipeline error: while parsing a mapping
  in "run config", line 93, column 1:
      background:
    ^ (line: 93)
required key(s) '14' not found
  in "run config", line 119, column 1:
        - /Users/adam/askap/pipeline ...
    ^ (line: 119).
CommandError: Config error:
Pipeline error: while parsing a mapping
  in "run config", line 93, column 1:
      background:
    ^ (line: 93)
required key(s) '14' not found
  in "run config", line 119, column 1:
        - /Users/adam/askap/pipeline ...
    ^ (line: 119).

Screen Shot 2021-04-08 at 19 05 47

@ajstewart ajstewart added documentation Improvements or additions to documentation enhancement New feature or request labels Apr 8, 2021
Copy link
Contributor

@ajstewart ajstewart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very minor points alongside the config validation messages I posted about earlier, I do think they have become too cryptic.

It's very nice to have a defined type of the config, that was causing me a lot of pain in the docstrings!

Thanks for doing it, from testing it locally it all seems to be working as expected through the UI and CLI.

Comment on lines 138 to 140
# Determine if epoch-based association should be used based on input files.
# If inputs are dicts, then the user has defined their own epochs. If inputs are
# lists, we must convert to dicts and auto-fill the epochs.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment might need updating to reflect better what's going on now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marxide do you think this bit still needs changing to remove references to the input dictionaries etc?

@marxide
Copy link
Contributor Author

marxide commented Apr 19, 2021

@marxide I think the validation errors may need some clarification.

I agree in that case it is cryptic. I've updated the error messages for that particular validation check and some others in b4b9e60. The standard schema checks will still have the original error messages, but I don't think those ones are very cryptic. Let me know if you come across one that is and we can look at catching it.

@marxide marxide requested a review from ajstewart April 19, 2021 21:26
@ajstewart
Copy link
Contributor

@marxide I think the validation errors may need some clarification.

I agree in that case it is cryptic. I've updated the error messages for that particular validation check and some others in b4b9e60. The standard schema checks will still have the original error messages, but I don't think those ones are very cryptic. Let me know if you come across one that is and we can look at catching it.

@marxide is there anyway to make the errors even more specific? For example if I remove a background image the error is now (which is of course better!):

Error in config validation:
Pipeline error: Number of epochs per input type do not match..
pipeline_config.validate(user=request.user)
File "/Users/adam/GitHub/vast-pipeline/vast_pipeline/pipeline/config.py", line 261, in validate
raise PipelineConfigError("Number of epochs per input type do not match.")
vast_pipeline.pipeline.errors.PipelineConfigError: Pipeline error: Number of epochs per input type do not match..

But before it would point state clearly that the number of background images does not match the images so the user knows exactly what to fix (note the double full stop there from somewhere, I copied it from the website). It could be an extra bit of info attached to this error message which I think is a good general one.

On a different point, I've noticed that when you create a run through the website without specifying any background files and monitor off the background heading is missing in the config:

  image:
  # list input images here, e.g. (note the leading hyphens)
  # - /path/to/image1.fits
  # - /path/to/image2.fits

  selavy:
  # list input selavy catalogues here, as above with the images

  noise:
  # list input noise (rms) images here, as above with the images

  # Required only if source_monitoring.monitor is true, otherwise optional. If not providing
  # background images, remove the entire background section below.

source_monitoring:

is it possible to make this appear regardless? As if a user went back to add background images they may not know what heading to enter as it's not there. The comment is also referring to a section that isn't present in this example.

marxide added 2 commits April 22, 2021 16:39
Run config template will now always contain the background section. If source monitoring is off, it will be commented out.
@marxide
Copy link
Contributor Author

marxide commented Apr 23, 2021

Changed the error messages to be more specific. I thought it would be best to explicitly list the epoch or file counts when there is an error. Some examples below.

When there is a background file missing:

Error in config validation
Pipeline error: The number of files must match for all input types.
image has 2 files
selavy has 2 files
noise has 2 files
background has 1 file.

When there is a Selavy file missing for one of the epochs in epoch-mode:

Error in config validation
Pipeline error: The number of files per epoch does not match between input types.
image:
ep1: 3
ep2: 2
ep3: 2
selavy:
ep1: 3
ep2: 1
ep3: 2
noise:
ep1: 3
ep2: 2
ep3: 2.

When one of the Selavy epochs doesn't match the others in epoch-mode:

Error in config validation
Pipeline error: The name of the epochs must match for all input types.
image has 3 epochs: ep1, ep2, ep3
selavy has 3 epochs: ep1, ep2, ep4
noise has 3 epochs: ep1, ep2, ep3.

The background section will now always be printed to the config file when create via the website. It will be commented out when source monitoring is turned off.

Copy link
Contributor

@ajstewart ajstewart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing those points, it all looks great!

I think this is now good to go (don't forget to change the merge branch to dev).

@marxide marxide changed the base branch from master to dev April 23, 2021 18:08
@marxide marxide merged commit 162c49d into dev Apr 23, 2021
@marxide marxide deleted the run-config-yaml branch April 23, 2021 18:16
marxide added a commit that referenced this pull request May 21, 2021
* Bump y18n from 3.2.1 to 3.2.2

Bumps [y18n](https://github.com/yargs/y18n) from 3.2.1 to 3.2.2.
- [Release notes](https://github.com/yargs/y18n/releases)
- [Changelog](https://github.com/yargs/y18n/blob/master/CHANGELOG.md)
- [Commits](https://github.com/yargs/y18n/commits)

Signed-off-by: dependabot[bot] <[email protected]>

* Add epoch based association clarification to documentation

* .env file text and dark mode

- Added dark mode to docs.
- Added text about the .env file to documentation.
- Clarified and made clearer the Django Q requirement for running jobs.

* Changed VAST logo to icon format to not appear stretched

* Apply suggestions from code review

Co-authored-by: Andrew O'Brien <[email protected]>

* Bump minimum mkdocs-material to 7.1.1 to ensure dark mode functionality

* Updated CHANGELOG.md

* Fixed CHANGELOG.md

* Updated the docs layout, home page and FAQs

- Enabled a new home page with wave design and announcement bar.
- Changed layout to tabs and sections mode.
- Changed location of some pages to other tab headings.
- Added to FAQ section.

* Updated CHANGELOG.md

* Moved CSS to own stylesheet file

- Fixed Safari image width

* align bokehJS and bokeh python versions (#489)

* align bokehJS and bokeh python versions
Fixes #488

* added comment for bokeh to pyproject.toml

* replace run config .py files with .yaml (#483)

* replace run config .py files with .yaml
Changes the pipeline run configuration file format to YAML.
Implements a strict YAML validator.
Closes #479.

* update docs

* updated CHANGELOG

* Add pipeline/config.py to code reference nav.

* Fix UI arrow file set to False.

* Apply suggestions from code review

Co-authored-by: Adam Stewart <[email protected]>

* docs and docstring fixes

* improve pipeline config error messages

* background section always given in run config
Run config template will now always contain the background section. If source monitoring is off, it will be commented out.

* more specific run config validation messages

* lock deps

Co-authored-by: Adam Stewart <[email protected]>
Co-authored-by: Adam Stewart <[email protected]>

* Added announcement bar update step to making a release doc

* Fix bokeh hover tool for lightcurve plots (#493)

* fix missing lightcurve hovertool for non-variables

* add default x-range of 1 day to lightcurve

* updated CHANGELOG
Fixes #490

* Added lightgallery support

* Complete lightgallery support and more

* Fixed docstring in config.py.
* Updated all image captions.
* Split off `DataTables` and `Tags and Favourites` to their own pages.
* Expanded documentation of the documentation.
* Enclosed all file examples in example admonition.
* Renamed docs/quickstart -> docs/gettingstarted.
* Added acknowledgements page.
* Moved FAQs to before Reference.
* Fixed MathJax and instant navigation.

* fix broken cone search when coord = 0 (#501)

* fix broken cone search when coord = 0
Fixes #495

* updated CHANGELOG

* Change arrow file generation from vaex to pyarrow

* Removed vaex dependency.
* Removed UI warnings of arrow files not being generated when the run is processed via the website.
* Updated docs.

Fixes #497.

* Minor fixes

* added cos(dec) scaling to aladin lite footprint (#502)

* Renamed FAQs -> FAQ



* Added missing lightgallery css and js files



* Added "Relations False Variability" section to association page



* fix "back to top" TOC link

The code to add a "back to top" link to the table of contents would fail with an error message on pages that don't have a TOC.
The link was also never added if the user started on a page without a TOC as the mkdocs-material instant loading feature prevents JS from being executed when navigating between local docs pages. This patch ensures the link is always added.

* Update docs/design/association.md

Co-authored-by: Andrew O'Brien <[email protected]>

* set default auto field model (#507)

* set default auto field model
Silences warnings introduced in Django 3.2. Closes #505.
 since this doesn't change the model, it only tells Django what it already is.

* updated CHANGELOG

* Add glob expression support to yaml run config (#504)

* added support for globs in yaml run config
Input files can be specified with one or more glob expressions. See docs for details.
Resolved input file paths are matched and printed to the run log.
Closes #485
updated changelog

* fixed missing background images from run log

* wrap config snippets in example admonishment

Apply suggestions from code review.

Co-authored-by: Adam Stewart <[email protected]>

* add docstring and fix docs

* fix outdated ref in tests docs

Co-authored-by: Adam Stewart <[email protected]>

* add tmate action debugging (#512)

Enable by triggering a manual test run. See https://github.com/marketplace/actions/debugging-with-tmate


* Revert "add tmate action debugging (#512)" (#513)

This reverts commit 836e5a6.

* Draw selected source components on top in JS9 (#508)

* Draw selected source components on top in JS9
Fixes #200
 no changes covered by tests

* update changelog


* changed to single JS9 shape layer
JS9 only allows the top shape layer to respond to mouse events. Reverted
back to a single layer with the shapes internally ordered.


* Fix rogue relations (#510)

- Fixed rogue relations
- Fixed docstring typos.
- Replace deprecated pandas kwarg
- Update expected regression test data
- Updated changelog to reflect test changes
- Refactored test_most_relations() to sort by relations and RA.

Co-authored-by: Andrew O'Brien <[email protected]>

* Added created and updated dates to doc pages. (#514)

* Added mkdocs-git-revision-date-localized-plugin to docs

* Added date created to pages with override

- Also changed docs workflow to fetch with fetch-depth: 0.

* Updated CHANGELOG.md

* Updated documentation docs page

* Corrected tree on docs page

* remove linebreaks from coordinates (#515)

* remove linebreaks from coordinates
Removes the excess whitespace between the lat and lon coordinates which was being included in the clipboard copy text.


* pin djangorestframework-datatables to 0.5.1 (#517)

* pin djangorestframework-datatables to 0.5.1
Later versions give the wrong total number of results during pagination. It only seems to affect cases when a different model is paginated, e.g. the list of measurements of a run returned by views.RunViewSet.measurements.


* update doc related to default dev branch (#521)

* update doc related to default dev branch

Co-authored-by: Andrew O'Brien <[email protected]>

* Fixed external search results table nan values (#523)

Co-authored-by: Andrew O'Brien <[email protected]>

* Updated mkdocs-material for native creation date support (#518)

Co-authored-by: Andrew O'Brien <[email protected]>

* removed actions ssh key
forced-phot is now public and doesn't require authentication to clone 

* Updated source detail page layout (#526)

* Updated source detail page layout

- Added first detection JS9 at the top of the page.
- Extended the external results row count.
- Swap position of JS9 and aladin on measurement detail page to match.

* Minor tidy of new lines

- 

* fixed broken image detail link (#528)

* fixed broken image detail link
Fixes #527 

* updated changelog

* remove action git config to use ssh
With no private repo dependencies, we can use the default anonymous https instead.

* Bump lodash from 4.17.20 to 4.17.21 (#525)

Bumps [lodash](https://github.com/lodash/lodash) from 4.17.20 to 4.17.21.
- [Release notes](https://github.com/lodash/lodash/releases)
- [Commits](lodash/lodash@4.17.20...4.17.21)

Signed-off-by: dependabot[bot] <[email protected]>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump hosted-git-info from 2.8.8 to 2.8.9 (#524)

Bumps [hosted-git-info](https://github.com/npm/hosted-git-info) from 2.8.8 to 2.8.9.
- [Release notes](https://github.com/npm/hosted-git-info/releases)
- [Changelog](https://github.com/npm/hosted-git-info/blob/v2.8.9/CHANGELOG.md)
- [Commits](npm/hosted-git-info@v2.8.8...v2.8.9)

Signed-off-by: dependabot[bot] <[email protected]>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Versioning (#532)

* add versioning and updated release docs
Closes #492

* add version to run pipeline log
Closes #529

* updated migration docs
We should no longer instruct devs to squash migrations since the pipeline models are mostly stable and large runs exist

* exclude version file from docs

* added discussions and defines issues and PRs

* updated changelog


* simplify git flow description


* add version to page footer


* updated npm deps; removed bootstrap (#533)

* updated npm deps; removed bootstrap
Updated npm deps according to semver rules.
Removed bootstrap in favour of the version bundled with startbootstrap-sb-admin-2.

* updated changelog

* fixed breadcrumb truncation


* Update homepage text (#534)

* Update homepage text

- Reference to documentation.
- Removed reference to users not being able to run pipeline runs.
- Added version to homepage.

* Minor change to homepage text


* added help and acknowledgement doc page (#535)

* added help and acknowledgement doc page
Updated acknowledgements
Removed email link from docs footer
Closes #522

* docs: fix email links, removed slack from footer

* updated changelog


* changed source naming convention (#536)

* changed source naming convention
Replaced ASKAP_<coord> naming convention with J<coord>. Coords have an RA precision of 1 and a Dec precision of 0, matching the convention used in the VAST-P1 paper.
Sources can be searched by name with an optional "VAST" prefix which is silently removed before searching.
Includes a data migration to update existing source names.

* updated changelog
Closes #530

* remove redundant types from docstrings
Co-authored-by: Adam Stewart <[email protected]>

* fixed formatting of examples in docstrings

* define reverse data migration

* Removed survey source models, commands and references (#538)

* Removed survey source models, commands and references

- Removed `Survey`, `SurveySource` and `SurveySourceQuerySet` from models.
- Removed related fields from Source model and crossmatch relation.
- Removed `importsurvey` command and docs page.
- Updated schema diagrams.

Non related minor fixes:
- Updated command line docs page for `clearpiperun` command.
- Added type annotation to `remove_forced_meas()` in `forced_extraction.py`.

Fixes #462.

* Added merged migration file

* delete leftover refs to removed models

* added required fields to SourceSerializer

* Removed default_survey option from run config
- Also removed `SURVEYS_WORKING_DIR` from settings and env.
- Removed references from documentation.

Co-authored-by: Andrew O'Brien <[email protected]>

* prep for release v1.0.0

* updated release docs


Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Adam Stewart <[email protected]>
Co-authored-by: Shibli Saleheen <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Run configuration implementation security issue
2 participants