Commit

Merge pull request #734 from njtierney/tweaking-wishart-fix-from-729
Tweaking wishart fix from 729
njtierney authored Nov 6, 2024
2 parents cb14e95 + 118ee26 commit 80192ed
Showing 21 changed files with 462 additions and 340 deletions.
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -45,7 +45,7 @@ Imports:
     R6,
     reticulate (>= 1.19.0),
     rlang,
-    tensorflow (>= 2.8.0),
+    tensorflow (== 2.16.0),
     tools,
     utils,
     whisker,
56 changes: 35 additions & 21 deletions NEWS.md
@@ -28,51 +28,65 @@ The following optimisers are removed, as they are no longer supported by Tensorf

## Installation revamp

This release provides a few improvements to installation in greta. It should now provide more information about installation progress, and be more robust. The intention is, it should _just work_, and if it doesn't fail gracefully with some useful advice on problem solving.
This release provides a few improvements to installation in greta. It should now provide more information about installation progress, and be more robust. The intention is, it should _just work_, and if it doesn't, it should fail gracefully with some useful advice on problem solving.

* Added option to restart R + run `library(greta)` after installation (#523)
* Added installation deps object, `greta_deps_sepc()` to help simplify specifying package versions (#664)
* removed `method` and `conda` arguments from `install_greta_deps()` as they
* Added option to restart R + run `library(greta)` after installation (#523).
* Added installation deps object, `greta_deps_spec()`, to help simplify specifying package versions (#664).
* Removed `method` and `conda` arguments from `install_greta_deps()` as they
were not used.
* removed `manual` argument in `install_greta_deps()`.
* added default 5 minute timer to installation processes
* Added `greta_deps_receipt()` to list the current main python packages installed. (#668)
* Added checking suite to ensure you are using valid versions of TF, TFP, and Python(#666)
* Removed `manual` argument in `install_greta_deps()`.
* Added default 5 minute timer to installation processes.
* Added `greta_deps_receipt()` to list the current main Python packages installed (#668).
* Added checking suite to ensure you are using valid versions of TF, TFP, and Python (#666).
* Added data `greta_deps_tf_tfp` (#666), which contains valid version combinations of TF, TFP, and Python.
* remove `greta_nodes_install/conda_*()` options as #493 makes them defunct.
* Added option to write to a single logfile with `greta_set_install_logfile()`, and `write_greta_install_log()`, and `open_greta_install_log()` (#493)
* Added `destroy_greta_deps()` function to remove miniconda and python conda environment
* Improved `write_greta_install_log()` and `open_greta_install_log()` to use `tools::R_user_dir()` to always write to a file location. `open_greta_install_log()` will open one found from an environment variable or go to the default location. (#703)
* Removed `greta_nodes_install/conda_*()` options as #493 makes them defunct.
* Added option to write to a single logfile with `greta_set_install_logfile()`, `write_greta_install_log()`, and `open_greta_install_log()` (#493).
* Added `destroy_greta_deps()` function to remove miniconda and the Python conda environment.
* Improved `write_greta_install_log()` and `open_greta_install_log()` to use `tools::R_user_dir()` to always write to a file location. `open_greta_install_log()` will open one found from an environment variable or go to the default location (#703).

## New Print methods

* New print method for `greta_mcmc_list`. This means MCMC output will be shorter and more informative. (#644)
* greta arrays now have a print method that stops them from printing too many rows into the console. Similar to MCMC print method, you can control the print output with the `n` argument: `print(object, n = <elements to print>)`. (#644)
* New print method for `greta_mcmc_list`. This means MCMC output will be shorter and more informative (#644).
* greta arrays now have a print method that stops them from printing too many rows into the console. Similar to the MCMC print method, you can control the print output with the `n` argument: `print(object, n = <elements to print>)` (#644).
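
As a rough illustration of how an `n` argument can cap printed output, here is a minimal base R sketch. The `toy_draws` class and `new_draws()` constructor are hypothetical stand-ins invented for this example; this is not greta's print method.

```r
# Hypothetical stand-in class; NOT greta's implementation.
new_draws <- function(x) {
  structure(list(values = x), class = "toy_draws")
}

# S3 print method that shows at most `n` elements.
print.toy_draws <- function(x, ..., n = 10) {
  total <- length(x$values)
  shown <- min(n, total)
  cat("toy_draws with", total, "elements\n")
  print(utils::head(x$values, shown))
  if (shown < total) {
    cat("... with", total - shown, "more elements\n")
  }
  # print methods conventionally return their input invisibly
  invisible(x)
}

draws <- new_draws(seq_len(100))
out <- utils::capture.output(print(draws, n = 3))
```

Returning the object invisibly keeps the method usable in pipelines without printing twice.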

## Minor

* `greta_sitrep()` now checks for installations of Python, TF, and TFP
* `greta_sitrep()` now checks for installations of Python, TF, and TFP.
* Slice sampler no longer needs precision = "single" to work.
* greta now depends on R 4.1.0, which was released May 2021, over 3 years ago.
* export `is.greta_array()` and `is.greta_mcmc_list()`
* `restart` argument for `install_greta_deps()` and `reinstall_greta_deps()` to automatically restart R (#523)
* Exported `is.greta_array()` and `is.greta_mcmc_list()`.
* Added `restart` argument for `install_greta_deps()` and `reinstall_greta_deps()` to automatically restart R (#523).

## Internals

* Internally we are replacing most of the error handling code with separate
`check_*` functions.
* Implemented `cli::cli_abort/warn/inform()` in place of `cli::format_error/warning/message()` + `stop/warning/message(msg, call. = FALSE)` pattern.
* Uses legacy optimizer internally (Use `tf$keras$optimizers$legacy$METHOD` over `tf$keras$optimizers$METHOD`). No user impact expected.
* Update photo of Grete Hermann (#598)
* Use `%||%` internally to replace the pattern: `if (is.null(x)) x <- thing` with `x <- x %||% thing`. (#630)
* Update photo of Grete Hermann (#598).
* Use `%||%` internally to replace the pattern: `if (is.null(x)) x <- thing` with `x <- x %||% thing` (#630).
* Add more explaining variables - replace `if (thing & thing & what == this)` with `if (explanation_of_thing)`.
* Refactored repeated uses of `vapply` into functions (#377, #658)
* Refactored repeated uses of `vapply` into functions (#377, #658).
* Add internal data files `.deps_tf` and `.deps_tfp` to track dependencies of TF and TFP. Related to #666.

- Posterior density checks (#720):
  - Don't run Geweke on CI as it takes 30 minutes to run.
  - Add thinning to Geweke tests.
  - Fix broken Geweke tests from the TF1 to TF2 change.
  - Increase the number of effective samples for `check_samples` for the LKJ distribution.
  - Add more checks to posterior to run on CI/on each test of greta.
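
The `%||%` and explaining-variable items in the Internals list can be sketched in base R as follows. The operator is defined locally so the example runs without extra packages (rlang exports an operator with this behaviour); `default_old()`, `default_new()`, and `describe_chains()` are hypothetical names made up for illustration and do not appear in greta.

```r
# Local definition so the sketch is self-contained.
`%||%` <- function(x, y) if (is.null(x)) y else x

# Before: explicit NULL check followed by reassignment.
default_old <- function(x = NULL) {
  if (is.null(x)) x <- 10
  x
}

# After: the same default expressed with %||%.
default_new <- function(x = NULL) {
  x <- x %||% 10
  x
}

# Explaining variable: name the compound condition, then branch on the name.
describe_chains <- function(n_chains, n_cores = NULL) {
  more_chains_than_cores <- !is.null(n_cores) && n_chains > n_cores
  if (more_chains_than_cores) {
    "queue chains"
  } else {
    "run all chains at once"
  }
}
```

Both default helpers return the same values; the second form simply states the fallback in one expression.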

## Bug fixes

* Fix bug where matrix multiply had dimension error before coercing to greta array. (#464)
- Fixes for Wishart and LKJ Correlation distributions (#729, #733, #734):
  - Add bijection density to choleskied distributions.
  - Note about some issues with LKJ and our normalisation constant for the density.
  - Removed our custom `forward_log_det_jacobian()` function from `tf_correlation_cholesky_bijector()` (used in `lkj_correlation()`). Previously, it did not work with unknown dimensions, but it now works with them.
  - Ensure Wishart uses `sigma_chol` in `scale_tril`.
  - Wishart uses `tf$matmul(chol_draws, chol_draws, adjoint_b = TRUE)` instead of `tf_chol2symm(chol_draws)`.
  - Test that the log prob function returns valid numeric values.
  - Address issue with log prob returning NaNs: replace `FillTriangular` with `FillScaleTriL` and apply chaining to first transpose the input.
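
For real-valued input, a matrix product with `adjoint_b = TRUE` multiplies a matrix by its own transpose, which is how a symmetric matrix is rebuilt from a lower-triangular Cholesky factor. The base R sketch below shows that identity only; it is an analogue under that assumption, not greta's TensorFlow code, and all variable names are invented for the example.

```r
set.seed(1)

# Build a symmetric positive definite matrix to factorise.
a <- matrix(rnorm(9), nrow = 3, ncol = 3)
sigma <- crossprod(a) + diag(3)

# chol() returns the upper-triangular factor; transpose to get the lower one.
chol_lower <- t(chol(sigma))

# L %*% t(L) recovers the original symmetric matrix.
sigma_rebuilt <- chol_lower %*% t(chol_lower)
```

Comparing `sigma` and `sigma_rebuilt` with `all.equal()` confirms the round trip up to floating point tolerance.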

# greta 0.4.5

13 changes: 13 additions & 0 deletions R/checkers.R
@@ -1943,6 +1943,19 @@ check_has_representation <- function(repr,
}
}

check_has_anti_representation <- function(repr,
                                          name,
                                          error,
                                          call = rlang::caller_env()){
  not_anti_represented <- error && is.null(repr)
  if (not_anti_represented) {
    cli::cli_abort(
      message = "{.cls greta_array} has no anti representation {.var {name}}",
      call = call
    )
  }
}

check_is_greta_array <- function(x,
                                 arg = rlang::caller_arg(x),
                                 call = rlang::caller_env()){
17 changes: 17 additions & 0 deletions R/greta_array_class.R
@@ -257,6 +257,23 @@ has_representation <- function(x, name) {
!is.null(repr)
}

anti_representation <- function(x, name, error = TRUE) {
  if (is.greta_array(x)) {
    x_node <- get_node(x)
  } else {
    x_node <- x
  }
  repr <- x_node$anti_representations[[name]]
  check_has_anti_representation(repr, name, error)
  repr
}


has_anti_representation <- function(x, name){
  repr <- anti_representation(x, name, error = FALSE)
  !is.null(repr)
}

# helper function to make a copy of the greta array & tensor
copy_representation <- function(x, name) {
repr <- representation(x, name)
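
The lookup-then-check pattern used by `anti_representation()` and `has_anti_representation()` above can be tried out with plain lists. This is a simplified sketch under stated assumptions: a "node" here is an ordinary list rather than an R6 object, errors use `stop()` rather than `cli::cli_abort()`, and the `toy_*` names are hypothetical.

```r
# Simplified stand-in: look up a named entry, optionally erroring if absent.
toy_anti_representation <- function(node, name, error = TRUE) {
  repr <- node$anti_representations[[name]]
  missing_and_required <- error && is.null(repr)
  if (missing_and_required) {
    stop("node has no anti representation '", name, "'", call. = FALSE)
  }
  repr
}

# The predicate reuses the lookup with errors switched off.
toy_has_anti_representation <- function(node, name) {
  !is.null(toy_anti_representation(node, name, error = FALSE))
}

chol_node <- list(anti_representations = list(chol2symm = "sigma"))
plain_node <- list(anti_representations = list())

found <- toy_has_anti_representation(chol_node, "chol2symm")
absent <- toy_has_anti_representation(plain_node, "chol2symm")
failure <- tryCatch(
  toy_anti_representation(plain_node, "chol2symm"),
  error = function(e) conditionMessage(e)
)
```

Routing the predicate through the same lookup keeps the "is it there?" and "give it to me" paths from drifting apart.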
3 changes: 2 additions & 1 deletion R/greta_stash.R
@@ -28,5 +28,6 @@ greta_stash$tf_num_error <- greta_note_msg
 #' greta_notes_tf_error()
 #' }
 greta_notes_tf_num_error <- function() {
-  cat(greta_stash$tf_num_error)
+  # wrap in paste0 to remove list properties
+  cat(paste0(greta_stash$tf_num_error))
 }
3 changes: 3 additions & 0 deletions R/inference_class.R
@@ -744,6 +744,9 @@ sampler <- R6Class(
)

# get trace of free state and drop the null dimension
if (is.null(batch_results$all_states)){
browser()
}
free_state_draws <- as.array(batch_results$all_states)

# if there is one sample at a time, and it's rejected, conversion from
43 changes: 43 additions & 0 deletions R/node_class.R
@@ -5,7 +5,12 @@ node <- R6Class(
unique_name = "",
parents = list(),
children = list(),
# named greta arrays giving different representations of the greta array
# represented by this node that have already been calculated, to be used for
# computational speedups or numerical stability. E.g. a logarithm or a
# cholesky factor
representations = list(),
anti_representations = list(),
.value = array(NA),
dim = NA,
distribution = NULL,
@@ -82,6 +87,19 @@ node <- R6Class(
parents <- c(parents, list(self$distribution))
}

if (mode == "sampling" & has_representation(self, "cholesky")){
# remove cholesky representation node from parents
parent_names <- extract_unique_names(parents)
antirep_name <- get_node(self$representations$cholesky)$unique_name
parent_names_keep <- setdiff(parent_names, antirep_name)
parents <- parents[match(parent_names_keep, parent_names)]
}

if (mode == "sampling" & has_anti_representation(self, "chol2symm")){
chol2symm_node <- get_node(self$anti_representations$chol2symm)
parents <- c(parents, list(chol2symm_node))
}

parents
},
add_child = function(node) {
@@ -273,6 +291,31 @@ node <- R6Class(
}

label
},
make_antirepresentations = function(representations){
mapply(
FUN = self$make_one_anti_representation,
representations,
names(representations)
)
},
make_one_anti_representation = function(ga, name){
node <- get_node(ga)
anti_name <- self$find_anti_name(name)
node$anti_representations[[anti_name]] <- as.greta_array(self)
node
},
find_anti_name = function(name){
switch(name,
cholesky = "chol2symm",
chol2symm = "chol",
exp = "log",
log = "exp",
probit = "iprobit",
iprobit = "probit",
logit = "ilogit",
ilogit = "logit"
)
}
)
)
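
The name lookup in `find_anti_name()` pairs each transformation with its inverse through `switch()`. The standalone sketch below copies only the pairs that are symmetric in the diff; `toy_find_anti_name()` is a hypothetical free function, not the R6 method. In the diff, `cholesky` maps to `chol2symm` while `chol2symm` maps to `chol`, so that pair does not round-trip by name and is left out here.

```r
# Standalone version of the inverse-name lookup, symmetric pairs only.
toy_find_anti_name <- function(name) {
  switch(name,
    exp = "log",
    log = "exp",
    probit = "iprobit",
    iprobit = "probit",
    logit = "ilogit",
    ilogit = "logit"
  )
}

# Applying the lookup twice should return the starting name.
round_trip <- toy_find_anti_name(toy_find_anti_name("logit"))

# switch() with no matching branch and no default returns NULL.
unknown <- toy_find_anti_name("sqrt")
```

Because an unmatched name yields `NULL`, callers that index a list with the result may want to guard against unknown names.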
58 changes: 34 additions & 24 deletions R/node_types.R
@@ -84,12 +84,6 @@ operation_node <- R6Class(
operation_args = NA,
arguments = list(),
tf_function_env = NA,

# named greta arrays giving different representations of the greta array
# represented by this node that have already been calculated, to be used for
# computational speedups or numerical stability. E.g. a logarithm or a
# cholesky factor
representations = list(),
initialize = function(operation,
...,
dim = NULL,
@@ -127,6 +121,7 @@ operation_node <- R6Class(
self$operation <- tf_operation
self$operation_args <- operation_args
self$representations <- representations
self$make_antirepresentations(representations)
self$tf_function_env <- tf_function_env

# assign empty value of the right dimension, or the values passed via the
@@ -158,23 +153,23 @@ operation_node <- R6Class(
# browser()
tensor <- dag$draw_sample(self$distribution)

if (has_representation(self, "cholesky")) {
# if (has_representation(self, "cholesky")) {
# browser()
cholesky_tensor <- tf_chol(tensor)
# cholesky_tf_name <- dag$tf_name(self$representation$cholesky)
cholesky_node <- get_node(representation(self, "cholesky"))
cholesky_tf_name <- dag$tf_name(cholesky_node)
assign(cholesky_tf_name, cholesky_tensor, envir = tfe)
# cholesky_tensor <- tf_chol(tensor)
# # cholesky_tf_name <- dag$tf_name(self$representation$cholesky)
# cholesky_node <- get_node(representation(self, "cholesky"))
# cholesky_tf_name <- dag$tf_name(cholesky_node)
# assign(cholesky_tf_name, cholesky_tensor, envir = tfe)
## TF1/2
## This assignment I think is supposed to be passed down to later on
## in the script, as `cholesky_tf_name` gets overwritten
# cholesky_tf_name <- dag$tf_name(self)
# tf_name <- cholesky_tf_name
# tensor <- cholesky_tensor
cholesky_tensor <- tf_chol(tensor)
cholesky_tf_name <- dag$tf_name(self$representation$cholesky)
assign(cholesky_tf_name, cholesky_tensor, envir = dag$tf_environment)
}
# cholesky_tensor <- tf_chol(tensor)
# cholesky_tf_name <- dag$tf_name(self$representation$cholesky)
# assign(cholesky_tf_name, cholesky_tensor, envir = dag$tf_environment)
# }
}

if (mode == "forward") {
@@ -292,14 +287,29 @@ variable_node <- R6Class(
distrib_node <- self$distribution

if (is.null(distrib_node)) {

# if the variable has no distribution create a placeholder instead
# (the value must be passed in via values when using simulate)
shape <- to_shape(c(1, self$dim))
# TF1/2 check
# need to change the placeholder approach here.
# NT: can we change this to be a tensor of the right shape with 1s?
tensor <- tensorflow::as_tensor(1L, shape = shape, dtype = tf_float())
# does it have an anti-representation where it is the cholesky?
# the antirepresentation of cholesky is chol2symm
# if it does, we will take the antirepresentation and get it to `tf` itself
# then we need to get the tf_name
chol2symm_ga <- self$anti_representations$chol2symm
chol2symm_existing <- !is.null(chol2symm_ga)
if (chol2symm_existing) {
chol2symm_node <- get_node(chol2symm_ga)
chol2symm_name <- dag$tf_name(chol2symm_node)
chol2symm_tensor <- get(chol2symm_name, envir = dag$tf_environment)
tensor <- tf_chol(chol2symm_tensor)
}

# chol2symm_ga$define_tf(dag)
# } else {
#
# # if the variable has no distribution create a placeholder instead
# # (the value must be passed in via values when using simulate)
# shape <- to_shape(c(1, self$dim))
# # TF1/2 check
# # need to change the placeholder approach here.
# # NT: can we change this to be a tensor of the right shape with 1s?
# tensor <- tensorflow::as_tensor(1L, shape = shape, dtype = tf_float())
} else {
tensor <- dag$draw_sample(self$distribution)
}