Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CLN: Refactor window dataset, add tests #654

Merged
merged 3 commits into from
Mar 19, 2023

Conversation

NickleDave
Copy link
Collaborator

Fixes #653

Refactor:

- Refactor into sub-package with two modules:
  `class_` module with `WindowDataset` class,
  and module with `helper` functions
- Rename vectors + parameters `spect_id`, `spect_inds_vector`,
  `x_inds` -> `source_ids`, `source_inds`, `window_inds`
- use helper functions in `vak.core.learncurve.train_dur_csv_paths`
- rewrite docstrings, add type hints to signatures

Write tests

- Define ALL_GENERATED_LEARNCURVE_CONFIGS as constant
  in fixtures/config.py to be able to import elsewhere
- Define GENERATED_RESULTS_DATA_ROOT as constant
  in fixtures/test_data.py to be able to import elsewhere
- Remove constants in tests/test_core/test_train_dur_csv_paths.py
  Not needed after adding defaults for `spect_key`
  and `timebins_key` in `vak.core.learncurve.train_dur_csv_paths`
- Add fixtures for testing WindowDataset.
  To test WindowDataset we need to get the path
  to a previous run of learncurve so we have
  examples of the vectors used to represent
  windows in a dataset.
  - Add hard-coded list MODELS to fixtures/model.py
  - Add fixtures/results.py with constants,
    GENERATED_LEARNCURVE_RESULTS
- Add tests/test_datasets/test_window_dataset/
- Revise docstrings, add type hints, defaults
  in learncurve/train_dur_csv_paths.py
- CLN: Used named kwargs in places in core/learncurve/
  In learncurve.py and train_dur_csv_paths.py,
  to avoid passing wrong value to wrong parameter.
- CLN: Rename module/function in vak/core/learncurve
  - Rename `vak.core.learncurve.train_dur_csv_paths` -> `splits`
    because what we care about are the splits themselves,
    not the fact they are represented as csv paths
    and organized by training set durations
  - Rename `_dict_from_dur` to `from_previous_run_path`
    since this is what the function really does and
    there's no need to have it be "private"
- Move the two test modules into tests/test_core/test_learncurve/
  and rename test_learncurve/test_train_dur_csv_paths ->
  test_learncurve/test_splits
@NickleDave NickleDave merged commit d3dd7ae into main Mar 19, 2023
@NickleDave NickleDave deleted the clean-window-dataset-add-tests branch March 19, 2023 15:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CLN: Clean up WindowDataset, add unit tests
1 participant