add_traceability errors #19
Good catches @kyleam, and yes I think we should address this before the first release. At the very least, we should capture these errors and print a "could not render Traceability Matrix" in the doc, instead of blowing up. Let's chat about some of this and then document the outcome of that in this issue. Thanks.
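For reference, here's a minimal sketch of that capture-and-fall-back idea; `build_traceability_matrix()` and the tarball name are hypothetical stand-ins, not the package's actual API:

```r
# Hypothetical helper standing in for whatever builds the matrix; here it
# simply fails, to simulate one of the errors reported below.
build_traceability_matrix <- function(pkg_source) {
  stop("could not map man files to R scripts")
}

trace_df <- tryCatch(
  build_traceability_matrix("MASS_7.3-60.tar.gz"),
  error = function(e) {
    warning("Could not render Traceability Matrix: ", conditionMessage(e))
    NULL
  }
)

# Downstream, the report can show a placeholder instead of the table.
if (is.null(trace_df)) {
  message("could not render Traceability Matrix")
}
```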
I think we made an incorrect assumption that the man files should also map to an R file (more on that after the output below). Can't decide if we should just ditch the warning or keep it in some reduced form. The current output looks like this:

```
In package `MASS`, could not find an R script associated with man file: abbey.Rd
In package `MASS`, could not find an R script associated with man file: accdeaths.Rd
In package `MASS`, could not find an R script associated with man file: addterm.Rd
In package `MASS`, could not find an R script associated with man file: Aids2.Rd
In package `MASS`, could not find an R script associated with man file: Animals.Rd
In package `MASS`, could not find an R script associated with man file: anorexia.Rd
In package `MASS`, could not find an R script associated with man file: anova.negbin.Rd
In package `MASS`, could not find an R script associated with man file: area.Rd
In package `MASS`, could not find an R script associated with man file: bacteria.Rd
In package `MASS`, could not find an R script associated with man file: bandwidth.nrd.Rd
In package `MASS`, could not find an R script associated with man file: bcv.Rd
In package `MASS`, could not find an R script associated with man file: beav1.Rd
In package `MASS`, could not find an R script associated with man file: beav2.Rd
In package `MASS`, could not find an R script associated with man file: biopsy.Rd
In package `MASS`, could not find an R script associated with man file: birthwt.Rd
In package `MASS`, could not find an R script associated with man file: Boston.Rd
In package `MASS`, could not find an R script associated with man file: boxcox.Rd
... keeps going ...
In package `MASS`, the R scripts (R/add.R, R/boxcox.R, R/contr.sdif.R, R/corresp.R, R/cov.trob.R, R/dose.p.R,
R/eqscplot.R, R/fitdistr.R, R/fractions.R, R/gamma.shape.R, R/glmmPQL.R, R/hist.scott.R, R/huber.R, R/hubers.R,
R/isoMDS.R, R/kde2d.R, R/lda.R, R/lm.ridge.R, R/loglm.R, R/logtrans.R, R/lqs.R, R/mca.R, R/misc.R, R/neg.bin.R,
R/negbin.R, R/negexp.R, R/parcoord.R, R/qda.R, R/rlm.R, R/sammon.R, R/stdres.R, R/ucv.R, R/write.matrix.R)
are missing documentation for the following exports:
dropterm
boxcox
contr.sdif
... keeps going ...
```
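For context on why hand-written man files break the mapping: roxygen2-generated Rd files record the source script in a header comment, while hand-written ones (as in MASS) don't, so there is nothing to point back to. A small illustrative helper (not the package's actual code):

```r
# roxygen2-generated Rd files start with lines like:
#   % Generated by roxygen2: do not edit by hand
#   % Please edit documentation in R/mca.R
# Hand-written Rd files, e.g. MASS's abbey.Rd, have no such reference.
rd_source_script <- function(rd_file) {
  header <- readLines(rd_file, n = 2L)
  hit <- grep("Please edit documentation in", header, value = TRUE)
  if (length(hit) == 0L) {
    return(NA_character_)  # hand-written man file; no associated script recorded
  }
  sub(".*Please edit documentation in ", "", hit)
}
```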
Perhaps that's behind a lot (or even all?) of the noise I reported in gh-21.
Yes it definitely is. I am currently fixing this with a
…defined on the next line:
- Previously, functions defined as follows could not be captured because they were separated into multiple elements: `hello <- function(x){}`
- By collapsing to a single string, the optional white space now works for new lines as well. This was the core issue for the `Exact` package basename error in #19.
- Additional testing of this is required.
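A rough illustration of why the collapse helps (the pattern below is only illustrative, not the one the package uses):

```r
lines <- c(
  "hello <-",
  "  function(x) {",
  "    x + 1",
  "  }"
)

# Matching line-by-line misses `hello` because the name and the `function`
# keyword sit in different elements; collapsing to one string lets the
# optional whitespace match across the newline.
code <- paste(lines, collapse = "\n")
pattern <- "([[:alnum:]._]+)[[:space:]]*(<-|=)[[:space:]]*function[[:space:]]*\\("
regmatches(code, regexec(pattern, code))[[1]][2]
#> [1] "hello"
```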
…hat were written by hand, and not generated via roxygen. In these cases the man files did not reference the R script the function originated in.
- We could just say these packages don't have documentation/were not found, but I added support for at least attempting to find the scripts.
- In this commit, I look for the function name as an R file if none were found. In other words, man/my_function.Rd would lead to a search for R/my_function.R.
- We could further improve this searching, but I'm sure we don't want to go crazy with it either. This is a minimal patch that works for the `Exact (3.2)` mentioned in #19 (basename error).
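A minimal sketch of that fallback search, assuming `pkg_dir` points at an unpacked package source (names here are illustrative, not the actual implementation):

```r
library(fs)

# If no R script was found for a man file, try R/<man file basename>.R,
# e.g. man/my_function.Rd -> R/my_function.R.
find_fallback_script <- function(pkg_dir, man_file) {
  candidate <- path(pkg_dir, "R", paste0(path_ext_remove(path_file(man_file)), ".R"))
  if (file_exists(candidate)) candidate else NA_character_
}
```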
- was the issue for Cairo 1.6-0 in #19
@barrettk I know you're leaning toward using an eval-based approach for getting a set of top-level functions declared in a file. If we wanted to avoid evaluating, I think we could get pretty far without diving too deep into the parse data. In case it's a useful start, here's an example. This could miss some unusual metaprogramming-based declarations and might need to be extended for other constructs of interest, but it may be good enough for our purposes.

```r
#' Return top-level functions and generics from files
#'
#' @param files Character vector of files to parse and search.
#'
#' @return A list the same length as `files`. Each element is the set of
#' functions and generics found in the given file. If `files` is a named
#' vector, carry those over to the return value. Otherwise set the names to
#' the relative file names.
#' @noRd
get_toplevel_declarations <- function(files) {
if (is.null(names(files))) {
names(files) <- fs::path_rel(files)
}
purrr::map(
files,
function(f) {
pd <- tryCatch(parse(file = f, keep.source = TRUE), error = identity)
if (inherits(pd, "error")) {
warning("Failed to parse ", f, ": ", conditionMessage(pd))
return(character())
}
data <- utils::getParseData(pd)
c(
toplevel_functions(data),
toplevel_generics(data)
)
}
)
}
toplevel_functions <- function(data) {
# Find top-level functions that have the following parse structure:
#
# root (id = 0)
# `--expr -> node1
# ¦--expr -> node2
# ¦ `--SYMBOL: {name} -> node3
# ¦--LEFT_ASSIGN or EQ_ASSIGN
# `--expr
# ¦--FUNCTION
# [...]
extract_toplevel_names(
data,
select_fn = function(d) d$id[d$token == "FUNCTION"],
depth = 3L,
name_fn = function(d, id) {
node1 <- node_parent(d, id, 2)
n1_kids <- node_children(d, node1)
if (length(n1_kids) != 3) {
return(NA_character_)
}
node2 <- n1_kids[1]
node3 <- node_children(d, node2)
if (length(node3) != 1) {
return(NA_character_)
}
row <- d[match(node3, d$id), ]
if (!identical(row$token, "SYMBOL")) {
return(NA_character_)
}
return(row$text)
}
)
}
toplevel_generics <- function(data) {
# Find top-level setGeneric calls that have the following parse structure:
#
# root (id = 0)
# `--expr -> node1
# ¦--expr
# ¦ `--SYMBOL_FUNCTION_CALL (setGeneric)
# ¦--'('
# ¦--expr -> node2
# ¦ `--STR_CONST: "{name}" -> node3
# [...]
# `--')
extract_toplevel_names(
data,
select_fn = function(d) {
is_gen <- d$token == "SYMBOL_FUNCTION_CALL" & d$text == "setGeneric"
      d$id[is_gen]
},
depth = 3L,
name_fn = function(d, id) {
node1 <- node_parent(d, id, 2)
n1_kids <- node_children(d, node1)
if (length(n1_kids) < 4) {
return(NA_character_)
}
node2 <- n1_kids[3]
node3 <- node_children(d, node2)
if (length(node3) != 1) {
return(NA_character_)
}
row <- d[match(node3, d$id), ]
if (!identical(row$token, "STR_CONST")) {
return(NA_character_)
}
text <- row$text
# For very long strings (1000 characters or more), `text` is replaced by
# `[N chars quoted with '"']`, and `getParseText()` can be used to extract
# it. In the context of this function, we're not expecting or interested
# in anything that long.
if (startsWith(text, "[")) {
return(NA_character_)
}
# Drop outer quotes.
return(substr(text, 2, nchar(text) - 1))
}
)
}
#' Extract top-level names
#'
#' @param data Data frame of parse data, as returned by `utils::getParseData()`.
#' @param select_fn Function to select nodes of interest from parse data. This
#' is called with `data` and should return a character vector of node IDs.
#' @param depth Consider a node returned by `select_fn` to be "top-level" if it
#' is at this depth relative to the root node.
#' @param name_fn Function that returns a name for the specified selected node.
#' This function is called with two arguments, `data` and a top-level node ID
#' returned by `select_fn`. If, on closer inspection, the passed node is not a
#' node of interest, this function should return `NA_character_` to indicate
#' that the value should be filtered out from the return value.
#'
#' @return A character vector of names.
#' @noRd
extract_toplevel_names <- function(data, select_fn, depth, name_fn) {
nodes <- purrr::keep(
select_fn(data),
function(id) identical(node_parent(data, id, depth), 0L)
)
nms <- purrr::map_chr(nodes, function(id) name_fn(data, id))
return(nms[!is.na(nms)])
}
node_parent <- function(data, id, n = 1) {
while (n > 0) {
id <- data$parent[match(id, data$id)]
n <- n - 1
}
return(id)
}
node_children <- function(data, id) {
data$id[data$parent == id]
}
```
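For what it's worth, a quick way to exercise the sketch above on a throwaway file (assuming the helpers are sourced and purrr/fs are available):

```r
f <- tempfile(fileext = ".R")
writeLines(
  c(
    "foo <- function(x) x + 1",
    'setGeneric("bar", function(obj) standardGeneric("bar"))'
  ),
  f
)
# Should report both the plain function and the S4 generic for this file.
get_toplevel_declarations(f)
```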
[ cc: @seth127 @barrettk ]
I started to run `render_scorecard` with `add_traceability = TRUE` and ran into many errors. Before aborting, I let it run for a bit (ended up being 79 packages) to hopefully capture a good representation of the different types of errors.

Here are the different classes of errors that I spotted. The packages listed under each class are just what happened to be in that first set of 79 packages.
no `testthat` directory found
packages: AER (1.2-10), BB (2019.10-1), DEoptimR (1.0-14), DEoptim (2.2-8), DT (0.28), DistributionUtils (0.6-0), Formula (1.2-5), GPArotation (2023.3-1), KernSmooth (2.23-22), MASS (7.3-60), MatrixModels (0.5-2), Matrix (1.6-0), PKI (0.1-12), PerformanceAnalytics (2.0.4)
no testing directory found
packages: ATE (0.2.0), AUC (0.3.2), BiasedUrn (2.0.10), DiagrammeRsvg (0.1), DiceDesign (1.9), Epi (2.47.1), FME (1.3.6.3), FNN (1.1.3.2), GA (3.2.3), GenSA (1.1.9), HDInterval (0.2.4), JM (1.5-2), JMbayes (0.8-85), Lahman (11.0-0), MCMCpack (1.6-3), MatchIt (4.5.4), NMF (0.26), PKPDdatasets (0.1.0.1494604782), PKPDmodels (0.3.2), PK (1.3-5)
basename error
packages: ALEPlot (1.1), Cairo (1.6-0), CompQuadForm (1.4.3), Exact (3.2), KMsurv (0.1-5), Kendall (2.2.1), LearnBayes (2.15.1), NonCompart (0.6.0), PEIP (2.2-3), PFIM (5.0), PKreport (1.5)
`.init` error
packages: AsioHeaders (1.22.1-2), BH (1.81.0-1)
invalid regular expression
packages: DescTools (0.99.49), Hmisc (4.7-2)
I'm putting this in the "First Release" milestone given that gh-7 has "[s]et add_traceability = TRUE to the default argument".