Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Writing attributes with length 0: buf parameter can't be NULL #208

Closed
rcannood opened this issue Dec 7, 2023 · 16 comments
Closed

Writing attributes with length 0: buf parameter can't be NULL #208

rcannood opened this issue Dec 7, 2023 · 16 comments

Comments

@rcannood
Copy link

rcannood commented Dec 7, 2023

Hi @hhoeflin ! Thanks again for creating and maintaining this incredibly useful package.

I'm having a few issues reading and writing attributes with arrays of length zero. Unfortunately this is quite blocking for a project I'm working on.

Here is a zip file of the h5ad file used in the examples below: example.zip

This is similar though I think unrelated to #118.

Any help in this matter would be much appreciated! I'll look through the hdf5r code to see if there is anything I can do to solve the problem.

Reading an attribute containing an empty array

When I try to read an attribute containing an empty array, I get an error message:

library(hdf5r)

file <- H5File$new("inst/extdata/example.h5ad", "r")

## reading empty attributes

h5attr(file[["obs"]], "column-order")

#  [1] "Float"                   "FloatNA"                
#  [3] "Int"                     "IntNA"                  
#  [5] "Bool"                    "BoolNA"                 
#  [7] "n_genes_by_counts"       "log1p_n_genes_by_counts"
#  [9] "total_counts"            "log1p_total_counts"     
# [11] "leiden"    

h5attr(file[["uns/DataFrameEmpty"]], "column-order")

# Error in attr_obj$read() : HDF5-API Errors:
#     error #000: ../../src/H5A.c in H5Aread(): line 702: buf parameter can't be NULL
#         class: HDF5
#         major: Invalid arguments to routine
#         minor: Bad value

file$close_all()

In comparison, this does work in h5py:

import h5py

## reading empty attributes
with h5py.File("inst/extdata/example.h5ad", "r") as f:
  nonempty = f["obs"].attrs["column-order"]
  print(f"Non-empty attribute: {nonempty}")
  uns_column_order = f["uns/DataFrameEmpty"].attrs["column-order"]
  print(f"Empty attribute: {uns_column_order}")

# Non-empty attribute: ['Float' 'FloatNA' 'Int' 'IntNA' 'Bool' 'BoolNA' 'n_genes_by_counts'
#  'log1p_n_genes_by_counts' 'total_counts' 'log1p_total_counts' 'leiden']
# Empty attribute: []

Writing an attribute containing an empty array

I also can't write empty arrays as attributes:

library(hdf5r)

file <- H5File$new("test.h5ad", "w")

h5attr(file, "nonempty") <- c("a", "b", "c")

h5attr(file, "nonempty")
# [1] "a" "b" "c"

h5attr(file, "empty") <- character(0)
# Error in attr$write(robj) : HDF5-API Errors:
#     error #000: ../../src/H5A.c in H5Awrite(): line 657: buf parameter can't be NULL
#         class: HDF5
#         major: Invalid arguments to routine
#         minor: Bad value

In Python, this does work:

## writing empty attributes
with h5py.File("test.h5ad", "w") as f:
  f.attrs["nonempty"] = ["a", "b", "c"]
  f.attrs["empty"] = []

  print(f"Non-empty attribute: {f.attrs['nonempty']}")
  print(f"Empty attribute: {f.attrs['empty']}")

# Non-empty attribute: ['a' 'b' 'c']
# Empty attribute: []
rcannood added a commit to rcannood/hdf5r that referenced this issue Dec 7, 2023
@rcannood
Copy link
Author

I created a PR (#209) which contains a unit test for this edge case.

@rcannood
Copy link
Author

I think the issue occurs whenever there's a function from the HDF5 library that expects a buffer.

  if(XLENGTH(R_buf) == 0) {
    buf = NULL;
  }
  else {
    buf = ...
  }
  // ...
  ... = H5SomeFunction(..., buf, ...);

It appears this buffer shouldn't be NULL (otherwise the error H5SomeFunction(): buf parameter can't be NULL is thrown).

Question is, with what should each of these buf = NULL be replaced? I've had some success by replacing buf = NULL; with buf = R_alloc(1, 1); // allocate 1 byte, but that causes other problems down the road.

@rcannood
Copy link
Author

I managed to solve the issue by replacing the XLENGTH(.) == 0 condition with TYPEOF(.) == NILSXP:

  const void* buf;
  if(TYPEOF(R_buf) == NILSXP) {
    buf = NULL;
  }
  else {
    buf = (void *) VOIDPTR(R_buf);
  }

I created PR #211 which allows the unit tests I added in #209 to pass. It should be noted that I only performed the substitution only where needed for allowing the tests in #209 to pass.

@hhoeflin
Copy link
Owner

sorry for the late reply. Will try to look at the fix over the Christmas break and thanks for the PR.

@rcannood
Copy link
Author

rcannood commented Jan 4, 2024

No worries and thanks! Is there anything I can help out with? Would it help to schedule a call?

@hhoeflin
Copy link
Owner

hhoeflin commented Jan 7, 2024

just a quick update. I had an initial look at this, but I need to dig deeper. A purely technical issue is that the places where you made the fix all need to be moved. The reason is that this code is auto-generated, so the way the code is generated has to be adapted upstream. As that code is somewhat ... ugly, I need to do that adaptation on my end.

Stay tuned

hhoeflin pushed a commit that referenced this issue Jan 7, 2024
* add test for issue #208

* empty commit

* Fix test: empty data frame should still contain some columns
@rcannood
Copy link
Author

rcannood commented Jan 7, 2024

Thanks for looking into this!

this code is auto-generated

It's good to hear that the code is auto-generated. I couldn't imagine maintaining this code base across different versions of HDF5 library.

Out of curiosity, is the script that generates the code in a public repository?

hhoeflin pushed a commit that referenced this issue Jan 7, 2024
* add test for issue #208

* empty commit

* Fix test: empty data frame should still contain some columns
@hhoeflin
Copy link
Owner

hhoeflin commented Jan 7, 2024

Fixed in #214.

You can look for the for the code generates the wrapper files under /inst/CWrappers... but beware, it is not pretty.

@hhoeflin hhoeflin closed this as completed Jan 7, 2024
@rcannood
Copy link
Author

rcannood commented Jan 8, 2024

Aha, that's what the patches are for! Makes sense :)

Thanks for fixing the issue!

hhoeflin pushed a commit that referenced this issue Jan 14, 2024
* add test for issue #208

* empty commit

* Fix test: empty data frame should still contain some columns
@hhoeflin hhoeflin reopened this Jan 14, 2024
@hhoeflin
Copy link
Owner

@rcannood : I had to revert the changes for now and reopen the issue. The hdf5r.Extra package is causing a reverse dependency error.

ycli1995/hdf5r.Extra#1

The code for the fix is located in the branch

https://github.com/hhoeflin/hdf5r/tree/bug/fix_empty_attrs_again

@rcannood
Copy link
Author

It seems @ycli1995 fixed ycli1995/hdf5r.Extra#1 ☺️

Could we get this fix merged and released? 🙇

@rcannood
Copy link
Author

Hi @hhoeflin!

I ran a revdepcheck on this branch:

> revdepcheck::revdep_check(timeout = as.difftime(600, units = "mins"), num_workers = 8)
── CHECK ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── 13 packages ──
✔ readNSx 0.0.2                          ── E: 0     | W: 0     | N: 0          
✔ hdf5r.Extra 0.0.5                      ── E: 0     | W: 0     | N: 0          
✔ rblt 0.2.4.6                           ── E: 0     | W: 0     | N: 0          
✔ dynutils 1.0.11                        ── E: 0     | W: 0     | N: 0   

I didn't let it finish completely, but I can already confirm that this PR does not introduce new errors to hdf5r.Extra anymore.

Would you be able to merge the PR? 🙇

@hhoeflin
Copy link
Owner

hhoeflin commented Feb 10, 2024 via email

@rcannood
Copy link
Author

Awesome, thanks!

@rcannood
Copy link
Author

Hi @hhoeflin !

I see that branch bug/fix_empty_attrs_again has not yet been merged.

Might I inquire about the status of this fix?

The described issue has been a blocking issue for months on end now, and it's saddening to see that a fix for it exists but is simply not being merged / released.

Happy to help out with whatever you would need help with!

Kind regards,
Robrecht

hhoeflin pushed a commit that referenced this issue Jul 7, 2024
* add test for issue #208

* empty commit

* Fix test: empty data frame should still contain some columns
hhoeflin pushed a commit that referenced this issue Jul 7, 2024
* add test for issue #208

* empty commit

* Fix test: empty data frame should still contain some columns
@rcannood
Copy link
Author

Hey @hhoeflin !

Thanks for releasing hdf5r 1.3.11! 🙇 🙇 🙇

I can confirm this has fixed my issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants