-
Notifications
You must be signed in to change notification settings - Fork 74
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[324] Fix rev dep #325
[324] Fix rev dep #325
Conversation
Important Review skippedAuto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the WalkthroughThis update introduces a comprehensive suite of changes to enhance the functionality and reliability of an R package. Key improvements include the addition of automated testing workflows via GitHub Actions, refinements in package dependencies, updates to model evaluation metrics, and enhancements to prediction logic. Furthermore, new tests and scripts for backward compatibility and model validation have been established, ensuring that the package maintains its integrity across updates. Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant GitHub Actions
participant R Environment
participant Package
User->>GitHub Actions: Push changes
GitHub Actions->>R Environment: Run tests and build
R Environment->>Package: Execute `make readme`
Package->>R Environment: Validate README changes
R Environment-->>GitHub Actions: Report results
GitHub Actions-->>User: Notify results
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (invoked as PR comments)
Additionally, you can add CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Outside diff range, codebase verification and nitpick comments (4)
inst/data-raw/test-all_models.R (2)
25-40
: Consider documenting the exclusion criteria forjava_models
.While the exclusion of certain models is clear, adding a comment to explain why these specific models are excluded could improve the maintainability and clarity of the code.
+ # Excluding models that require Java or are otherwise problematic java_models <- c( "gbm_h2o", "glmnet_h2o", ... )
84-106
: Thorough testing of model predictions.The test block effectively checks both the ability to predict and the handling of stacked predictions. The use of
expect_identical
andexpect_true
ensures that predictions are as expected. However, consider adding more detailed comments to explain the purpose of each test step, especially regarding the handling of models that predictInf
values.+ # Ensure predictions are made for all models and check for finite values testthat::expect_identical(nrow(pred), 5L) testthat::expect_identical(ncol(pred), length(all_models)) testthat::expect_true(all(unlist(lapply(pred, is.finite))))
inst/data-raw/build_backwards_compatability_data.R (2)
1-9
: Clarify the purpose and usage of the script.The initial comments provide context about the script's purpose but could benefit from more structured documentation. Consider adding a brief summary of the script's functionality and intended usage.
+ # This script isolates and prunes a problematic model for testing backward compatibility. # This script is a little big sorry. # We're using a 3rd party dataset from a package ...
53-83
: Iterative pruning logic is well-implemented.The
prune_list_iterative
function systematically removes elements from the model object. The logic for determining whether to keep or remove elements is sound. However, consider adding comments to explain the decision-making process in more detail.+ # If error changes or warnings appear, retain the element; otherwise, remove it if ((!is.null(result$error) && result$error != "is.vector(pred) is not TRUE") || !is.null(result$wrns)) { ... }
Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Files ignored due to path filters (2)
man/figures/README-greedy-stack-6-plot-1.png
is excluded by!**/*.png
man/figures/README-unnamed-chunk-5-1.png
is excluded by!**/*.png
Files selected for processing (11)
- .github/workflows/readme.yaml (1 hunks)
- DESCRIPTION (1 hunks)
- Makefile (4 hunks)
- R/caretPredict.R (1 hunks)
- README.md (5 hunks)
- README.rmd (1 hunks)
- inst/WORDLIST (2 hunks)
- inst/data-raw/build_backwards_compatability_data.R (1 hunks)
- inst/data-raw/test-all_models.R (1 hunks)
- tests/testthat/test-backwards-compatability.R (1 hunks)
- tests/testthat/test-caretList.R (1 hunks)
Files skipped from review due to trivial changes (1)
- inst/WORDLIST
Additional comments not posted (28)
.github/workflows/readme.yaml (4)
1-5
: Triggers are correctly set.The workflow triggers on push and pull request events for the
main
andmaster
branches, which is appropriate for CI workflows.
7-14
: Job setup follows best practices.The job is named
Tests
, runs onubuntu-latest
, and includes steps for checking out the repository and setting up R, which are standard practices.
14-19
: R environment setup is correctly implemented.The workflow uses
r-lib/actions/setup-r
to set up R andr-lib/actions/setup-r-dependencies
to install additional packages, which is a standard approach.
20-30
: README check is a useful addition.The step checks if
README.md
has changed and exits with an error if it has, ensuring documentation consistency.tests/testthat/test-backwards-compatability.R (1)
1-25
: Backward compatibility test is well-structured.The test effectively ensures that old models are loaded correctly and predictions are made in the expected format, covering key aspects of backward compatibility.
DESCRIPTION (5)
24-24
: Addition ofearth
toSuggests
aligns with PR objectives.The
earth
package is added to support features or testing related toearth
models, as mentioned in the PR summary.
27-27
: Addition ofklaR
toSuggests
aligns with PR objectives.The
klaR
package is added to support compatibility withklar
models, as mentioned in the PR summary.
30-30
: Addition ofmgcv
toSuggests
likely supports improvements.The
mgcv
package is added, possibly to support new features or testing requirements, aligning with the overall improvements.
33-33
: Addition ofpkgdown
toSuggests
enhances documentation capabilities.The
pkgdown
package is added to support documentation generation, which is beneficial for package maintenance.
40-40
: Removal ofusethis
fromSuggests
aligns with updated requirements.The
usethis
package is removed, possibly reflecting a shift in development practices or reduced need.inst/data-raw/test-all_models.R (3)
8-10
: Use ofsuppressWarnings
andsuppressMessages
is appropriate here.The function
very_quiet
is designed to suppress warnings and messages, which is suitable for testing scenarios where output noise should be minimized. This is a good practice for ensuring that only relevant test failures are reported.
15-24
: Data setup is clear and concise.The data setup section uses a fixed seed for reproducibility, which is essential for consistent test results. The creation of the
X
matrix andy
vector is straightforward and well-documented.
60-65
: Model exclusion rationale is well-considered.The exclusion of models that are too slow or cannot be installed is a practical approach to ensure the tests run efficiently. This helps focus on models that are feasible to test in a reasonable timeframe.
inst/data-raw/build_backwards_compatability_data.R (3)
24-51
: Robust error and warning handling.The
test_error
function captures both errors and warnings effectively. The use oftryCatch
andwithCallingHandlers
is appropriate for isolating issues within the model. This approach is well-suited for debugging complex models.
108-115
: Ensure saved model file path consistency.The
saveRDS
function saves the pruned model to a specific path. Ensure that this path remains consistent with the project's directory structure and update it if necessary.Verification successful
File path consistency verified.
The directory
tests/testthat/data
exists, and there are no conflicting references to this path elsewhere in the codebase. The path used in thesaveRDS
function is consistent with the project's directory structure.Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the existence of the saved model file path. # Test: Check if the directory path for saving the model exists. if [ ! -d "tests/testthat/data" ]; then echo "Directory tests/testthat/data does not exist. Please create it or update the path." exit 1 fiLength of output: 36
Script:
#!/bin/bash # Search for references to the path "tests/testthat/data" in the codebase rg "tests/testthat/data"Length of output: 24
11-12
: Ensure package version availability.The script installs a specific version of
LDLcalc
. Ensure that this version remains available on CRAN or consider alternative methods to handle potential unavailability in the future.Makefile (4)
24-24
: Addition ofcheck-many-preds
target is beneficial.Including the
check-many-preds
target in the help section provides clarity on its purpose and ensures users are aware of this new functionality.
139-142
: Integration ofcheck-many-preds
target.The
check-many-preds
target runs a script to validate predictions on numerous models. This integration is a valuable addition to ensure model reliability before release.
154-154
: Ensurecheck-many-preds
is executed during release.By including
check-many-preds
in the release process, you ensure that model predictions are validated, enhancing the robustness of the release.
166-166
: Improved cleanup with.Rout
removal.The addition of
.Rout
files to theclean
target helps maintain a tidy workspace by removing temporary files generated during R script execution.README.rmd (1)
42-42
: Good practice: Set seed for reproducibility.The addition of
set.seed(42L)
ensures that the random sampling of thediamonds
dataset is reproducible, which is a good practice for examples and tests.README.md (3)
72-82
: Update in RMSE and weights for greedy ensemble.The RMSE and model weights in the greedy ensemble have been updated. Verify these changes align with updates in the model or training data.
Verification successful
Verification Successful: RMSE and Weights Updates in README.md
The updates to the RMSE and weights for the greedy ensemble in the README.md align with the underlying code and tests. The
greedyMSE
function handles these calculations, and the test cases verify their accuracy. No discrepancies found.Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify consistency of RMSE and weights updates in greedy ensemble. # Test: Search for changes in ensemble training or data affecting RMSE and weights. rg --type r 'caretEnsemble'Length of output: 10670
Script:
#!/bin/bash # Search for RMSE calculations in the context of the greedy ensemble rg 'RMSE' -A 5 # Search for weight assignments in the context of the greedy ensemble rg 'Weights' -A 5Length of output: 12637
52-53
: Update in RMSE values for model accuracy.The RMSE values for the random forest and glmnet models have been updated, indicating changes in model performance. Ensure these updates are consistent with any changes in the model or data.
Line range hint
101-115
: Update in RMSE and variance explained for non-linear ensemble.The RMSE and variance explained in the non-linear ensemble have been updated. Ensure these reflect changes in model configuration or data.
Verification successful
Verification Successful: RMSE and Variance Explained Updates Reflected
The README.md file correctly reflects the updated RMSE and variance explained for the non-linear ensemble using
caretStack
. The values are consistent with recent changes in the codebase.
- RMSE: 1020.387
- Variance Explained: 93.33%
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify consistency of RMSE and variance updates in non-linear ensemble. # Test: Search for changes in non-linear ensemble affecting RMSE and variance. rg --type r 'caretStack'Length of output: 12143
Script:
#!/bin/bash # Description: Search for recent changes in the codebase related to RMSE and variance explained. # Look for recent commits that mention RMSE or variance. git log -p -S 'RMSE' -- README.md git log -p -S 'variance explained' -- README.mdLength of output: 7750
tests/testthat/test-caretList.R (1)
295-326
: Enhanced test coverage forcaretList
with array/matrix models.The new test case effectively validates
caretList
functionality with models that return arrays or matrices, such as "earth", "gam", and "stepLDA". This addition is well-structured and improves test coverage.R/caretPredict.R (3)
31-32
: Validate data frame output for probability predictions.The
stopifnot
statement ensures that the probability predictions are returned as a data frame. This is a good practice to validate the output type.
34-38
: Validate numeric vector output for raw predictions.The
stopifnot
and conversion to a vector ensure that raw predictions are numeric vectors. This is important for backward compatibility with older models.
27-29
: Ensure matrix conversion is necessary.The conversion of
newdata
to a matrix for "neuralnet" and "klaR" libraries is a necessary adjustment. Ensure that this conversion is consistently required for these libraries across all scenarios.
Fix for #324
https://github.com/zachmayer/caretEnsemble/actions/runs/10276799499/job/28438114871#step:4:46
The new caretEnsemble package breaks some old, saved caret::train models that are earth::earth models.
This PR adds a serialized version of such a model as well as a backwards compatibility test to make sure it works. I also added a line of code to fix the test.
There's also a fix for some packages that require matricies at prediction time.
Also add a test as part of the release process that we can do a caretList and predict for basically all models in caret.