Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preferred way to deal with indexed coordinates? #107

Closed
kthyng opened this issue Oct 20, 2020 · 7 comments · Fixed by #174
Closed

Preferred way to deal with indexed coordinates? #107

kthyng opened this issue Oct 20, 2020 · 7 comments · Fixed by #174

Comments

@kthyng
Copy link
Contributor

kthyng commented Oct 20, 2020

What is the preferred way to deal with the following situation?

image

So, a specific time is chosen in the model output. In the first example, the time is retained in the DataArray as a coordinate that has no dimension (see next pic)

image

When the time is retained, the DataArray has a "T" axis picked up by cf-xarray. However, there isn't actually a dimension of time in this DataArray, so I would get an error if I tried to transpose the T array based on what is returned from cf.axes.

In the second case in the first pic, I can drop the coordinate that was indexed on with drop=True to avoid the "T" axis being identified. However, this is sort of annoying to need to make sure people do, and it then also loses the information of which time is being used.

Is there a reasonable around this in cf-xarray? Should an Axis or Coordinate not be identified if it has no shape/dimension/something?

@dcherian
Copy link
Contributor

I tried to transpose the T array based on what is returned from cf.axes

So ds.reset_coords(drop=True).cf.axes should work.

Should an Axis or Coordinate not be identified if it has no shape/dimension/something?

This (more general problem) is a little tricky. In this case, I would like ds.cf["T"] to work.

BUT I had a similar issue recently where there is a lon indexed dimension and a lon_rho non-coordinate variable. This means that .cf.sel(longitude=...) breaks. So for this kind of thing we'd want the _get_axis_coord mapper to only look at dimension names.

I am not sure on how to cleanly solve this.

@kthyng
Copy link
Contributor Author

kthyng commented Oct 21, 2020

I tried to transpose the T array based on what is returned from cf.axes

So ds.reset_coords(drop=True).cf.axes should work.

Ok this will do the job for me for now. Thanks for pointing it out.

@kthyng kthyng closed this as completed Oct 21, 2020
@dcherian dcherian reopened this Oct 21, 2020
@dcherian
Copy link
Contributor

Reopening for

BUT I had a similar issue recently where there is a lon indexed dimension and a lon_rho non-coordinate variable. This means that .cf.sel(longitude=...) breaks. So for this kind of thing we'd want the _get_axis_coord mapper to only look at dimension names.

@kthyng
Copy link
Contributor Author

kthyng commented Oct 22, 2020

I would also prefer that _get_axis_coord only consider dimension names.

@dcherian
Copy link
Contributor

I think the solution here is to refactor _get_axis_coord to

  1. _get_indexes : only look at .indexes.keys()
  2. _get_coords: only looks at .coords.keys()
  3. _get_dims: only looks at .dims

@dcherian
Copy link
Contributor

dcherian commented Feb 8, 2021

Actually now that most of the time we are using _get_axis_coord and _get_with_standard_name, I think we should refactor to

  1. _get_indexes : only look at .indexes.keys()
  2. _get_coords: only looks at .coords.keys()
  3. _get_dims: only looks at .dims

where each of those functions returns the intersection of .cf.keys and .dims or .coords or .indexes as appropriate.

Then we use those mappers appropriately ... .sel would use _get_indexes, .transpose would use _get_dims etc.

There's probably some outlier case where two functions use the same kwarg name but one needs dims but the other needs coords. If so, we can fix this by explicitly adding those functions to the accessor.

@dcherian
Copy link
Contributor

dcherian commented Feb 21, 2021

xref #174 (comment)

Another fundamental issue is whether .axes et al should returns results present only in coordinates or in all variables. With the methods, we can control this by using _get_dims and friends. With the properties we don't really have a good way of adding any user control. That said, asking people to use set(obj.cf.axes) & set(obj.cf.dims) is really not too bad.

Maybe all properties should return names present in all variables (not just data variables), and we can add an FAQ showing people how to use set operations to pull out what's needed?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants