Metrics library updated to support comma separate inputs #900

scap3yvt
Collaborator

Fixes #N.A.

Proposed Changes

  • input csv can now be split into target and prediction csvs
  • multiple sanity checks to ensure coherence
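
As a rough illustration of the two bullets above, the sketch below splits a comma-separated input value into a target path and a prediction path, then checks that both files describe the same subjects. The helper names `split_input_paths` and `check_subject_coherence`, and the specific checks, are hypothetical and are not taken from the GaNDLF source.

```python
from typing import List, Optional, Tuple


def split_input_paths(value: str) -> Tuple[str, Optional[str]]:
    """Split "targets.csv,predictions.csv" into two paths.

    A value without a comma is treated as a single combined CSV.
    (Hypothetical helper, not the GaNDLF implementation.)
    """
    parts = [p.strip() for p in value.split(",")]
    # Reject empty segments ("a.csv,") and more than two paths.
    if not all(parts) or len(parts) > 2:
        raise ValueError(
            f"Expected one path or 'target,prediction', got: {value!r}"
        )
    if len(parts) == 1:
        return parts[0], None
    return parts[0], parts[1]


def check_subject_coherence(
    target_ids: List[str], prediction_ids: List[str]
) -> None:
    """Raise if the two files do not describe exactly the same subjects."""
    if len(set(target_ids)) != len(target_ids):
        raise ValueError("Duplicate subject IDs in the target file")
    if len(set(prediction_ids)) != len(prediction_ids):
        raise ValueError("Duplicate subject IDs in the prediction file")
    if set(target_ids) != set(prediction_ids):
        # Symmetric difference: IDs that appear in only one of the files.
        only_one = sorted(set(target_ids) ^ set(prediction_ids))
        raise ValueError(f"Subject IDs present in only one file: {only_one}")
```

Row order is deliberately ignored in the coherence check, on the assumption that the two files would be joined on the subject identifier rather than matched by position.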

Checklist

  • CONTRIBUTING guide has been followed.
  • PR is based on the current GaNDLF master.
  • Non-breaking change (does not break existing functionality): provide as many details as possible for any breaking change.
  • Function/class source code documentation added/updated (ensure typing is used to provide type hints, including and not limited to using Optional if a variable has a pre-defined value).
  • Code has been blacked for style consistency and linting.
  • If applicable, version information has been updated in GANDLF/version.py.
  • If adding a git submodule, add to list of exceptions for black styling in pyproject.toml file.
  • Usage documentation has been updated, if appropriate.
  • Tests added or modified to cover the changes; if coverage is reduced, please give explanation.
  • If customized dependency installation is required (i.e., a separate pip install step is needed for PR to be functional), please ensure it is reflected in all the files that control the CI, namely: python-test.yml, and all docker files [1,2,3].

@scap3yvt scap3yvt requested a review from a team as a code owner July 11, 2024 14:31
Contributor

github-actions bot commented Jul 11, 2024

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

@sarthakpati sarthakpati marked this pull request as draft July 11, 2024 14:38
@sarthakpati
Collaborator

Marking this as draft until this gets checked by @vpchung.

@scap3yvt
Collaborator Author

Hi @sarthakpati @VukW - the entrypoint tests are failing with this error:

2024-07-11T18:00:00.8453595Z testing/entrypoints/__init__.py:288: AssertionError
2024-07-11T18:00:00.8454648Z ----------------------------- Captured stdout call -----------------------------
2024-07-11T18:00:00.8455623Z Test failed on the new case: -i path_na -o output.json -c config.yaml

I am not sure why this is the case. Could one of you please help?

@sarthakpati
Collaborator

I believe this is happening because of this change, which allows a generic string to be passed:

image

This specific case, "-i path_na -o output.json -c config.yaml", needs to be removed from the tests: it will now fail, because click no longer checks the validity of the input passed to -i.

@vpchung vpchung left a comment

LGTM. I have a couple of suggestions, but they're minor.

Comment on lines +111 to +113
target_df = target_df.rename(columns={target_df.columns[1]: "Target"})
prediction_df = prediction_df.rename(
    columns={prediction_df.columns[1]: "Prediction"}

This assumes that the second column is always the one that contains the target or prediction values. Perhaps an assertion can be added before this to ensure that the first column is "SubjectID" before renaming the second column (on the off-chance that "SubjectID" is the second column instead of the first).
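
One way the suggested guard could look is sketched below. It is written against a plain list of column names so that it runs without pandas. The helper name `second_column_rename_map` and its `id_column` parameter are hypothetical, and this is not necessarily the check that was merged.

```python
from typing import Dict, Sequence


def second_column_rename_map(
    columns: Sequence[str], new_name: str, id_column: str = "SubjectID"
) -> Dict[str, str]:
    """Build the rename mapping for the second column.

    Asserts that the first column is the subject identifier, so the
    second column can safely be treated as the value column.
    (Hypothetical helper, not the GaNDLF implementation.)
    """
    assert len(columns) >= 2, "Expected at least two columns"
    assert columns[0] == id_column, (
        f"First column must be {id_column!r}, got {columns[0]!r}"
    )
    return {columns[1]: new_name}
```

With pandas, the mapping could then be passed straight to `rename`, for example `target_df.rename(columns=second_column_rename_map(list(target_df.columns), "Target"))`.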

Collaborator Author

Should be addressed - can you please check?

codecov bot commented Jul 11, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 94.45%. Comparing base (c331503) to head (5832c03).
Report is 62 commits behind head on new-apis_v0.1.0-dev.

Additional details and impacted files
@@                   Coverage Diff                   @@
##           new-apis_v0.1.0-dev     #900      +/-   ##
=======================================================
+ Coverage                94.40%   94.45%   +0.04%     
=======================================================
  Files                      158      159       +1     
  Lines                     9314     9391      +77     
=======================================================
+ Hits                      8793     8870      +77     
  Misses                     521      521              
Flag Coverage Δ
unittests 94.45% <100.00%> (+0.04%) ⬆️

Collaborator

@sarthakpati sarthakpati left a comment

Minor changes in wording

GANDLF/entrypoints/generate_metrics.py: two outdated review threads, both resolved
@sarthakpati sarthakpati marked this pull request as ready for review July 12, 2024 23:42
@sarthakpati sarthakpati merged commit 3e02cec into new-apis_v0.1.0-dev Jul 12, 2024
25 checks passed
@sarthakpati sarthakpati deleted the new-apis_v0.1.0-dev_metrics_support_separate_target-pred_files branch July 12, 2024 23:43
@github-actions github-actions bot locked and limited conversation to collaborators Jul 12, 2024