description of xarray assumes knowledge of pandas #1282

Closed
rabernat opened this issue Feb 22, 2017 · 4 comments

@rabernat
Contributor

The first sentence a potential new user reads about xarray is

xarray (formerly xray) is an open source project and Python package that aims to bring the labeled data power of pandas to the physical sciences, by providing N-dimensional variants of the core pandas data structures.

Now imagine you had never heard of pandas (like most new Ph.D. students in physical sciences). You would have no idea how useful and powerful xarray was.

I would propose modifying these top-level descriptions to remove the assumption that the user understands pandas. Of course we can still refer to pandas, but a more self-contained description would serve us well.

@shoyer
Member

shoyer commented Feb 23, 2017

Agreed!

Here's what @jhamman and I wrote in the abstract for our paper on Xarray:

Xarray is an open source project and Python package that provides a toolkit and data structures for N-dimensional labeled arrays. Our approach combines an application programming interface (API) inspired by pandas with the Common Data Model for self-described scientific data. Key features of the Xarray package include label-based indexing and arithmetic, interoperability with the core scientific Python packages (e.g., pandas, NumPy, Matplotlib), out-of-core computation on datasets that don't fit into memory, a wide range of serialization and input/output (I/O) options, and advanced multi-dimensional data manipulation tools such as group-by and resampling. Xarray, as a data model and analytics toolkit, has been widely adopted in the geoscience community but is also used more broadly for multi-dimensional data analysis in physics, machine learning and finance.

Probably something like that first sentence is a better high level description.

@byersiiasa

I agree as I was in this situation of jumping straight into xarray (and Python) having never used pandas. As for other key points that could be emphasised:

  • the concept of label-based indexing was new to me and may be something you want to emphasise more in the Page 1 description? (I see it is already nicely explained in the paper in reference to np.ndarrays.)
  • the automatic plotting with Matplotlib is super
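
For readers who have not met label-based indexing before, here is a minimal sketch of how it differs from positional indexing on a plain np.ndarray. The values, dimension names, and city labels are made up for illustration and do not come from the xarray docs or the paper.

```python
import numpy as np
import xarray as xr

# Hypothetical data: temperatures on a small (time, city) grid.
values = np.array([[10.0, 20.0, 30.0],
                   [11.0, 21.0, 31.0]])

# Plain NumPy: the caller has to remember that axis 1 is "city"
# and that Oslo happens to sit at position 1.
oslo_np = values[:, 1]

# xarray: dimensions and coordinates carry names, so selection is by label.
temps = xr.DataArray(
    values,
    dims=("time", "city"),
    coords={"time": [2000, 2001], "city": ["Bergen", "Oslo", "Tromso"]},
)
oslo_xr = temps.sel(city="Oslo")

# Reductions also name the dimension instead of passing an axis number.
mean_per_city = temps.mean(dim="time")
```

Both selections return the same numbers; the xarray version just does not depend on the order of the axes or the position of each city.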

@stale

stale bot commented Feb 26, 2019

In order to maintain a list of currently relevant issues, we mark issues as stale after a period of inactivity.

If this issue remains relevant, please comment here or remove the stale label; otherwise it will be marked as closed automatically.

@stale stale bot added the stale label Feb 26, 2019
@rabernat
Contributor Author

Closed by #2430 #2657.
