Skip to content

Commit

Permalink
GH-38084: [R] Do not memory map when explicitly checking for file rem…
Browse files Browse the repository at this point in the history
…oval (#38085)

### Rationale for this change

We have failing CI on Windows because removing files that have memory mapped sections is not allowed.

### What changes are included in this PR?

Pass `mmap = FALSE` when we explicitly check for removal. I think the other cases are OK because `unlink()` fails silently (which is maybe why we haven't seen mass CI failures because of this issue before).

### Are these changes tested?

The changes are covered by existing tests.

### Are there any user-facing changes?

No.

* Closes: #38084

Authored-by: Dewey Dunnington <[email protected]>
Signed-off-by: Dewey Dunnington <[email protected]>
  • Loading branch information
paleolimbot authored Oct 6, 2023
1 parent 629ecbd commit f525b99
Show file tree
Hide file tree
Showing 4 changed files with 21 additions and 10 deletions.
6 changes: 4 additions & 2 deletions r/R/parquet.R
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,7 @@
#'
#' @inheritParams read_feather
#' @param props [ParquetArrowReaderProperties]
#' @param mmap Use TRUE to use memory mapping where possible
#' @param ... Additional arguments passed to `ParquetFileReader$create()`
#'
#' @return A `tibble` if `as_data_frame` is `TRUE` (the default), or an
Expand All @@ -43,14 +44,15 @@ read_parquet <- function(file,
# Assembling `props` yourself is something you do with
# ParquetFileReader but not here.
props = ParquetArrowReaderProperties$create(),
mmap = TRUE,
...) {
if (!inherits(file, "RandomAccessFile")) {
# Compression is handled inside the parquet file format, so we don't need
# to detect from the file extension and wrap in a CompressedInputStream
file <- make_readable_file(file)
file <- make_readable_file(file, mmap = mmap)
on.exit(file$close())
}
reader <- ParquetFileReader$create(file, props = props, ...)
reader <- ParquetFileReader$create(file, props = props, mmap = mmap, ...)

col_select <- enquo(col_select)
if (!quo_is_null(col_select)) {
Expand Down
3 changes: 3 additions & 0 deletions r/man/read_parquet.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

20 changes: 13 additions & 7 deletions r/tests/testthat/test-parquet.R
Original file line number Diff line number Diff line change
Expand Up @@ -32,7 +32,7 @@ test_that("simple int column roundtrip", {
pq_tmp_file <- tempfile() # You can specify the .parquet here but that's probably not necessary

write_parquet(df, pq_tmp_file)
df_read <- read_parquet(pq_tmp_file)
df_read <- read_parquet(pq_tmp_file, mmap = FALSE)
expect_equal(df, df_read)
# Make sure file connection is cleaned up
expect_error(file.remove(pq_tmp_file), NA)
Expand Down Expand Up @@ -288,15 +288,21 @@ test_that("write_parquet() returns its input", {

test_that("write_parquet() handles version argument", {
df <- tibble::tibble(x = 1:5)
tf <- tempfile()
on.exit(unlink(tf))

purrr::walk(list("1.0", "2.4", "2.6", "latest", 1.0, 2.4, 2.6, 1L), ~ {
write_parquet(df, tf, version = .x)
versions <- list("1.0", "2.4", "2.6", "latest", 1.0, 2.4, 2.6, 1L)
purrr::walk(versions, function(x) {
tf <- tempfile()
on.exit(unlink(tf))

write_parquet(df, tf, version = x)
expect_identical(read_parquet(tf), df)
})
purrr::walk(list("3.0", 3.0, 3L, "A"), ~ {
expect_error(write_parquet(df, tf, version = .x))

invalid_versions <- list("3.0", 3.0, 3L, "A")
purrr::walk(invalid_versions, function(x) {
tf <- tempfile()
on.exit(unlink(tf))
expect_error(write_parquet(df, tf, version = x))
})
})

Expand Down
2 changes: 1 addition & 1 deletion r/tests/testthat/test-read-record-batch.R
Original file line number Diff line number Diff line change
Expand Up @@ -38,7 +38,7 @@ test_that("RecordBatchFileWriter / RecordBatchFileReader roundtrips", {
writer$close()
stream$close()

expect_equal(read_feather(tf, as_data_frame = FALSE), tab)
expect_equal(read_feather(tf, as_data_frame = FALSE, mmap = FALSE), tab)
# Make sure connections are closed
expect_error(file.remove(tf), NA)
skip_on_os("windows") # This should pass, we've closed the stream
Expand Down

0 comments on commit f525b99

Please sign in to comment.