Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Orthogonal selection #108

Merged
merged 38 commits into from
Oct 24, 2024
Merged

Orthogonal selection #108

merged 38 commits into from
Oct 24, 2024

Conversation

Artur-man
Copy link
Contributor

@Artur-man Artur-man commented Sep 3, 2024

Summary:

  • OrthogonalIndexer and IntArrayDimIndexer
  • Order class from zarr-python
  • IntDimIndexer (may need some work)
    • UPDATE should be fully function now under BasicIndexer
  • Future work: BoolArrayDimIndexer
  • update normalization for integer vectors
  • add methods for int and zb_int (are these needed?), currently implemented but not used ...
    • UPDATE int and zb_int are now fully implemented!
  • zero_based_to_one_based function now also applies to none slice selections
  • [.ZarrArray triggers orthogonal selection only (or should it?)
  • add tests for orthogonal selection

Relevant Issues: #107 #104 #97 #43

@Artur-man
Copy link
Contributor Author

Artur-man commented Sep 3, 2024

Here it is, tested a bunch of times but who knows :D Let me know what you guys think and lets review if you want.

@Artur-man
Copy link
Contributor Author

Artur-man commented Sep 7, 2024

Commented out exported functions under int.R to let pkgdown do its thing ... although I am not sure if int() or zb_int() are indeed needed.

Copy link
Owner

@keller-mark keller-mark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @Artur-man thank you very much for the PR. I had some comments about additional tests to add for the new internal classes. Since they contain indexing arithmetic it would be great to have unit tests to at least ensure there are no regressions.

expect_error(z[1], "This Zarr object has 2 dimensions, 1 were supplied")

# test `[.ZarrArray` against get_item
expect_equal(z[1:2,5], z$get_item(list(slice(1, 3), slice(5, 5))))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this line was previously

expect_equal(z[1:3,5], z$get_item(list(slice(1, 3), slice(5, 5))))

Did the behavior change?

Copy link
Contributor Author

@Artur-man Artur-man Sep 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah I have been meaning to discuss this, I think 1:2 and 1:3 have the same result here, but I guess I was experimenting with how to treat :.

Note that the shape of a is (2,10) and I think slice(1,3) would call python-based indices (0,1,2) which are the first three rows but I think slice doesn't care if indices are outside of dimension limits ?

> z$get_item(list(slice(1, 10), slice(5, 5)))$data
     [,1]
[1,]    9
[2,]   10
> z$get_item(list(slice(1, 20), slice(5, 5)))$data
     [,1]
[1,]    9
[2,]   10

As a remark all dimensional indices with a:b would be converted to slice(a,b), and this is still the case with manage_filters. See below:

pizzarr/R/zarr-array.R

Lines 1231 to 1235 in f84355d

} else if(typeof(x) == "language") {
x <- as.list(x)
# Return a range (supplied via : or seq())
if(x[[1]] == ":") {
return(slice(x[[2]], x[[3]]))

A typical R-based indexing with a:b and [ method would fail in this situation. Lets see what happens when we incorporate the same indexing without/with ZarrArray.

# with ZarrArray
> z <- zarr_create(shape=dim(a), dtype="<f4", fill_value=NA)
> z$set_item("...", a)
> z[1:3,5]$data
     [,1]
[1,]    9
[2,]   10

# without ZarrArray
> a <- array(data=1:20, dim=c(2, 10))
> a[1:3,5]
Error in a[1:3, 5] : subscript out of bounds

It is up to you @keller-mark, how would you like [ to behave.

@Artur-man Artur-man mentioned this pull request Sep 10, 2024
@Artur-man
Copy link
Contributor Author

Artur-man commented Oct 20, 2024

I had to do some internal changes to is_integer and is_integer_vec 3fa0a47

The previous implementation of is_integer was not behaving as the one in zarr-python. This is because R also returns TRUE for full integer vectors when is_integer is called, thus one should check the length too.

Thus now the function checks the length as well, similar modifications are done to is_integer_vec as well. Now the Indexer classes are more comparable to the zarr-python.

@Artur-man
Copy link
Contributor Author

pkgdown should be fixed now

@keller-mark
Copy link
Owner

Hi @Artur-man this is looking great, thank you for the tests and the comments and the references back to the python code.

I just opened Artur-man#1, can you merge that into your branch, and then I will merge this PR?

@Artur-man
Copy link
Contributor Author

just did!!

@keller-mark keller-mark merged commit 67359dd into keller-mark:main Oct 24, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants