Merging upstream/master into my dev branch
# Conflicts:
#	doc/source/whatsnew/v0.23.2.txt
SaturninoMateus committed Jun 21, 2018
2 parents 53aca81 + 506935c commit 726755f
Showing 67 changed files with 1,448 additions and 983 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -110,3 +110,4 @@ doc/source/styled.xlsx
doc/source/templates/
env/
doc/source/savefig/
+*my-dev-test.py
10 changes: 7 additions & 3 deletions asv_bench/benchmarks/categoricals.py
@@ -202,7 +202,11 @@ class Contains(object):
     def setup(self):
         N = 10**5
         self.ci = tm.makeCategoricalIndex(N)
-        self.cat = self.ci.categories[0]
+        self.c = self.ci.values
+        self.key = self.ci.categories[0]

-    def time_contains(self):
-        self.cat in self.ci
+    def time_categorical_index_contains(self):
+        self.key in self.ci
+
+    def time_categorical_contains(self):
+        self.key in self.c
6 changes: 3 additions & 3 deletions doc/source/api.rst
@@ -1200,9 +1200,9 @@ Attributes and underlying data
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
**Axes**

-* **items**: axis 0; each item corresponds to a DataFrame contained inside
-* **major_axis**: axis 1; the index (rows) of each of the DataFrames
-* **minor_axis**: axis 2; the columns of each of the DataFrames
+* **items**: axis 0; each item corresponds to a DataFrame contained inside
+* **major_axis**: axis 1; the index (rows) of each of the DataFrames
+* **minor_axis**: axis 2; the columns of each of the DataFrames

.. autosummary::
:toctree: generated/
43 changes: 21 additions & 22 deletions doc/source/basics.rst
@@ -50,9 +50,8 @@ Attributes and the raw ndarray(s)

pandas objects have a number of attributes enabling you to access the metadata

-* **shape**: gives the axis dimensions of the object, consistent with ndarray
-* Axis labels
-
+* **shape**: gives the axis dimensions of the object, consistent with ndarray
+* Axis labels
* **Series**: *index* (only axis)
* **DataFrame**: *index* (rows) and *columns*
* **Panel**: *items*, *major_axis*, and *minor_axis*
@@ -131,9 +130,9 @@ Flexible binary operations
With binary operations between pandas data structures, there are two key points
of interest:

-* Broadcasting behavior between higher- (e.g. DataFrame) and
-  lower-dimensional (e.g. Series) objects.
-* Missing data in computations.
+* Broadcasting behavior between higher- (e.g. DataFrame) and
+  lower-dimensional (e.g. Series) objects.
+* Missing data in computations.

We will demonstrate how to manage these issues independently, though they can
be handled simultaneously.
@@ -462,10 +461,10 @@ produce an object of the same size. Generally speaking, these methods take an
**axis** argument, just like *ndarray.{sum, std, ...}*, but the axis can be
specified by name or integer:

-- **Series**: no axis argument needed
-- **DataFrame**: "index" (axis=0, default), "columns" (axis=1)
-- **Panel**: "items" (axis=0), "major" (axis=1, default), "minor"
-  (axis=2)
+* **Series**: no axis argument needed
+* **DataFrame**: "index" (axis=0, default), "columns" (axis=1)
+* **Panel**: "items" (axis=0), "major" (axis=1, default), "minor"
+  (axis=2)
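The axis-by-name convention in the hunk above can be sketched with a toy DataFrame (a minimal illustration; names and data here are invented):

```python
import pandas as pd

# A small DataFrame to exercise axis-by-name reductions.
df = pd.DataFrame({"one": [1.0, 2.0, 3.0], "two": [4.0, 5.0, 6.0]},
                  index=["a", "b", "c"])

# "index" (axis=0) collapses the rows; "columns" (axis=1) collapses the columns.
col_sums = df.sum(axis="index")
row_sums = df.sum(axis="columns")
```

`col_sums` is a Series indexed by column name; the string names and the integer axis numbers are interchangeable.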

For example:

@@ -1187,11 +1186,11 @@ It is used to implement nearly all other features relying on label-alignment
functionality. To *reindex* means to conform the data to match a given set of
labels along a particular axis. This accomplishes several things:

-* Reorders the existing data to match a new set of labels
-* Inserts missing value (NA) markers in label locations where no data for
-  that label existed
-* If specified, **fill** data for missing labels using logic (highly relevant
-  to working with time series data)
+* Reorders the existing data to match a new set of labels
+* Inserts missing value (NA) markers in label locations where no data for
+  that label existed
+* If specified, **fill** data for missing labels using logic (highly relevant
+  to working with time series data)
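The three reindex behaviours listed above can be sketched on a toy Series:

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0], index=["a", "b", "c"])

reordered = s.reindex(["c", "b", "a"])               # reorders existing data
with_na = s.reindex(["a", "b", "f"])                 # unknown label 'f' becomes NaN
filled = s.reindex(["a", "b", "f"], fill_value=0.0)  # or fill it explicitly
```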

Here is a simple example:

@@ -1911,10 +1910,10 @@ the axis indexes, since they are immutable) and returns a new object. Note that
**it is seldom necessary to copy objects**. For example, there are only a
handful of ways to alter a DataFrame *in-place*:

-* Inserting, deleting, or modifying a column.
-* Assigning to the ``index`` or ``columns`` attributes.
-* For homogeneous data, directly modifying the values via the ``values``
-  attribute or advanced indexing.
+* Inserting, deleting, or modifying a column.
+* Assigning to the ``index`` or ``columns`` attributes.
+* For homogeneous data, directly modifying the values via the ``values``
+  attribute or advanced indexing.
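A short sketch of the in-place cases in the list above, contrasted with the copy-on-return behaviour of methods:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2]})
same = df

df["b"] = df["a"] * 10   # inserting a column modifies df in place
del df["b"]              # so does deleting one
df.columns = ["first"]   # assigning to the columns attribute

renamed = df.rename(columns={"first": "x"})  # methods return a new object
```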

To be clear, no pandas method has the side effect of modifying your data;
almost every method returns a new object, leaving the original object
@@ -2112,22 +2111,22 @@ Because the data was transposed the original inference stored all columns as obj
The following functions are available for one dimensional object arrays or scalars to perform
hard conversion of objects to a specified type:

-- :meth:`~pandas.to_numeric` (conversion to numeric dtypes)
+* :meth:`~pandas.to_numeric` (conversion to numeric dtypes)

  .. ipython:: python

     m = ['1.1', 2, 3]
     pd.to_numeric(m)

-- :meth:`~pandas.to_datetime` (conversion to datetime objects)
+* :meth:`~pandas.to_datetime` (conversion to datetime objects)

  .. ipython:: python

     import datetime
     m = ['2016-07-09', datetime.datetime(2016, 3, 2)]
     pd.to_datetime(m)

-- :meth:`~pandas.to_timedelta` (conversion to timedelta objects)
+* :meth:`~pandas.to_timedelta` (conversion to timedelta objects)

  .. ipython:: python
Expand Down
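The three hard-conversion helpers touched by this hunk can be sketched together (toy inputs):

```python
import datetime
import pandas as pd

nums = pd.to_numeric(["1.1", "2", 3])                   # -> float ndarray
dates = pd.to_datetime(["2016-07-09",
                        datetime.datetime(2016, 3, 2)])  # -> DatetimeIndex
deltas = pd.to_timedelta(["1 days", "2 days"])           # -> TimedeltaIndex
```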
10 changes: 5 additions & 5 deletions doc/source/categorical.rst
@@ -542,11 +542,11 @@ Comparisons

Comparing categorical data with other objects is possible in three cases:

-* Comparing equality (``==`` and ``!=``) to a list-like object (list, Series, array,
-  ...) of the same length as the categorical data.
-* All comparisons (``==``, ``!=``, ``>``, ``>=``, ``<``, and ``<=``) of categorical data to
-  another categorical Series, when ``ordered==True`` and the `categories` are the same.
-* All comparisons of a categorical data to a scalar.
+* Comparing equality (``==`` and ``!=``) to a list-like object (list, Series, array,
+  ...) of the same length as the categorical data.
+* All comparisons (``==``, ``!=``, ``>``, ``>=``, ``<``, and ``<=``) of categorical data to
+  another categorical Series, when ``ordered==True`` and the `categories` are the same.
+* All comparisons of a categorical data to a scalar.

All other comparisons, especially "non-equality" comparisons of two categoricals with different
categories or a categorical with any list-like object, will raise a ``TypeError``.
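A sketch of the allowed comparison cases; the category order ``3 < 2 < 1`` is deliberately reversed to show that ordering, not the raw value, decides:

```python
import pandas as pd

cat = pd.Series(pd.Categorical([1, 2, 3], categories=[3, 2, 1], ordered=True))

eq = cat == [1, 2, 3]  # equality to a same-length list-like is allowed
gt = cat > 2           # any comparison to a scalar uses the category order
# Non-equality comparisons to a plain list-like raise TypeError instead.
```

Because 1 is the "largest" category here, only the first element compares greater than 2.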
10 changes: 5 additions & 5 deletions doc/source/comparison_with_r.rst
@@ -18,11 +18,11 @@ was started to provide a more detailed look at the `R language
party libraries as they relate to ``pandas``. In comparisons with R and CRAN
libraries, we care about the following things:

-- **Functionality / flexibility**: what can/cannot be done with each tool
-- **Performance**: how fast are operations. Hard numbers/benchmarks are
-  preferable
-- **Ease-of-use**: Is one tool easier/harder to use (you may have to be
-  the judge of this, given side-by-side code comparisons)
+* **Functionality / flexibility**: what can/cannot be done with each tool
+* **Performance**: how fast are operations. Hard numbers/benchmarks are
+  preferable
+* **Ease-of-use**: Is one tool easier/harder to use (you may have to be
+  the judge of this, given side-by-side code comparisons)

This page is also here to offer a bit of a translation guide for users of these
R packages.
46 changes: 23 additions & 23 deletions doc/source/computation.rst
@@ -344,20 +344,20 @@ The weights used in the window are specified by the ``win_type`` keyword.
The list of recognized types are the `scipy.signal window functions
<https://docs.scipy.org/doc/scipy/reference/signal.html#window-functions>`__:

-- ``boxcar``
-- ``triang``
-- ``blackman``
-- ``hamming``
-- ``bartlett``
-- ``parzen``
-- ``bohman``
-- ``blackmanharris``
-- ``nuttall``
-- ``barthann``
-- ``kaiser`` (needs beta)
-- ``gaussian`` (needs std)
-- ``general_gaussian`` (needs power, width)
-- ``slepian`` (needs width).
+* ``boxcar``
+* ``triang``
+* ``blackman``
+* ``hamming``
+* ``bartlett``
+* ``parzen``
+* ``bohman``
+* ``blackmanharris``
+* ``nuttall``
+* ``barthann``
+* ``kaiser`` (needs beta)
+* ``gaussian`` (needs std)
+* ``general_gaussian`` (needs power, width)
+* ``slepian`` (needs width).

.. ipython:: python
@@ -537,10 +537,10 @@ Binary Window Functions
two ``Series`` or any combination of ``DataFrame/Series`` or
``DataFrame/DataFrame``. Here is the behavior in each case:

-- two ``Series``: compute the statistic for the pairing.
-- ``DataFrame/Series``: compute the statistics for each column of the DataFrame
+* two ``Series``: compute the statistic for the pairing.
+* ``DataFrame/Series``: compute the statistics for each column of the DataFrame
   with the passed Series, thus returning a DataFrame.
-- ``DataFrame/DataFrame``: by default compute the statistic for matching column
+* ``DataFrame/DataFrame``: by default compute the statistic for matching column
   names, returning a DataFrame. If the keyword argument ``pairwise=True`` is
   passed then computes the statistic for each pair of columns, returning a
   ``MultiIndexed DataFrame`` whose ``index`` are the dates in question (see :ref:`the next section
@@ -741,10 +741,10 @@ Aside from not having a ``window`` parameter, these functions have the same
interfaces as their ``.rolling`` counterparts. Like above, the parameters they
all accept are:

-- ``min_periods``: threshold of non-null data points to require. Defaults to
+* ``min_periods``: threshold of non-null data points to require. Defaults to
   minimum needed to compute statistic. No ``NaNs`` will be output once
   ``min_periods`` non-null data points have been seen.
-- ``center``: boolean, whether to set the labels at the center (default is False).
+* ``center``: boolean, whether to set the labels at the center (default is False).
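The effect of ``min_periods`` on an expanding window, as a minimal sketch:

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0])
out = s.expanding(min_periods=3).sum()
# The first two entries are NaN; from the third point on,
# the cumulative sum appears.
```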

.. _stats.moments.expanding.note:
.. note::
@@ -903,12 +903,12 @@ of an EW moment:
One must specify precisely one of **span**, **center of mass**, **half-life**
and **alpha** to the EW functions:

-- **Span** corresponds to what is commonly called an "N-day EW moving average".
-- **Center of mass** has a more physical interpretation and can be thought of
+* **Span** corresponds to what is commonly called an "N-day EW moving average".
+* **Center of mass** has a more physical interpretation and can be thought of
   in terms of span: :math:`c = (s - 1) / 2`.
-- **Half-life** is the period of time for the exponential weight to reduce to
+* **Half-life** is the period of time for the exponential weight to reduce to
   one half.
-- **Alpha** specifies the smoothing factor directly.
+* **Alpha** specifies the smoothing factor directly.
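The parameterisations are interchangeable; a sketch showing that span 5, center of mass 2, and alpha 1/3 describe the same decay:

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])

by_span = s.ewm(span=5).mean()        # alpha = 2 / (span + 1)
by_com = s.ewm(com=2).mean()          # c = (s - 1) / 2, alpha = 1 / (1 + c)
by_alpha = s.ewm(alpha=1 / 3).mean()  # smoothing factor given directly
```

All three produce the same exponentially weighted mean.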

Here is an example for a univariate time series:

70 changes: 35 additions & 35 deletions doc/source/contributing.rst
@@ -138,11 +138,11 @@ steps; you only need to install the compiler.

For Windows developers, the following links may be helpful.

-- https://blogs.msdn.microsoft.com/pythonengineering/2016/04/11/unable-to-find-vcvarsall-bat/
-- https://github.com/conda/conda-recipes/wiki/Building-from-Source-on-Windows-32-bit-and-64-bit
-- https://cowboyprogrammer.org/building-python-wheels-for-windows/
-- https://blog.ionelmc.ro/2014/12/21/compiling-python-extensions-on-windows/
-- https://support.enthought.com/hc/en-us/articles/204469260-Building-Python-extensions-with-Canopy
+* https://blogs.msdn.microsoft.com/pythonengineering/2016/04/11/unable-to-find-vcvarsall-bat/
+* https://github.com/conda/conda-recipes/wiki/Building-from-Source-on-Windows-32-bit-and-64-bit
+* https://cowboyprogrammer.org/building-python-wheels-for-windows/
+* https://blog.ionelmc.ro/2014/12/21/compiling-python-extensions-on-windows/
+* https://support.enthought.com/hc/en-us/articles/204469260-Building-Python-extensions-with-Canopy

Let us know if you have any difficulties by opening an issue or reaching out on
`Gitter`_.
@@ -155,11 +155,11 @@ Creating a Python Environment
Now that you have a C compiler, create an isolated pandas development
environment:

-- Install either `Anaconda <https://www.anaconda.com/download/>`_ or `miniconda
+* Install either `Anaconda <https://www.anaconda.com/download/>`_ or `miniconda
   <https://conda.io/miniconda.html>`_
-- Make sure your conda is up to date (``conda update conda``)
-- Make sure that you have :ref:`cloned the repository <contributing.forking>`
-- ``cd`` to the *pandas* source directory
+* Make sure your conda is up to date (``conda update conda``)
+* Make sure that you have :ref:`cloned the repository <contributing.forking>`
+* ``cd`` to the *pandas* source directory

We'll now kick off a three-step process:

@@ -286,15 +286,15 @@ complex changes to the documentation as well.

Some other important things to know about the docs:

-- The *pandas* documentation consists of two parts: the docstrings in the code
+* The *pandas* documentation consists of two parts: the docstrings in the code
itself and the docs in this folder ``pandas/doc/``.

The docstrings provide a clear explanation of the usage of the individual
functions, while the documentation in this folder consists of tutorial-like
overviews per topic together with some other information (what's new,
installation, etc).

-- The docstrings follow a pandas convention, based on the **Numpy Docstring
+* The docstrings follow a pandas convention, based on the **Numpy Docstring
Standard**. Follow the :ref:`pandas docstring guide <docstring>` for detailed
instructions on how to write a correct docstring.

@@ -303,7 +303,7 @@ Some other important things to know about the docs:

contributing_docstring.rst

-- The tutorials make heavy use of the `ipython directive
+* The tutorials make heavy use of the `ipython directive
<http://matplotlib.org/sampledoc/ipython_directive.html>`_ sphinx extension.
This directive lets you put code in the documentation which will be run
during the doc build. For example::
@@ -324,7 +324,7 @@ Some other important things to know about the docs:
doc build. This approach means that code examples will always be up to date,
but it does make the doc building a bit more complex.

-- Our API documentation in ``doc/source/api.rst`` houses the auto-generated
+* Our API documentation in ``doc/source/api.rst`` houses the auto-generated
documentation from the docstrings. For classes, there are a few subtleties
around controlling which methods and attributes have pages auto-generated.

@@ -488,8 +488,8 @@ standard. Google provides an open source style checker called ``cpplint``, but w
use a fork of it that can be found `here <https://github.com/cpplint/cpplint>`__.
Here are *some* of the more common ``cpplint`` issues:

-- we restrict line-length to 80 characters to promote readability
-- every header file must include a header guard to avoid name collisions if re-included
+* we restrict line-length to 80 characters to promote readability
+* every header file must include a header guard to avoid name collisions if re-included

:ref:`Continuous Integration <contributing.ci>` will run the
`cpplint <https://pypi.org/project/cpplint>`_ tool
@@ -536,8 +536,8 @@ Python (PEP8)
There are several tools to ensure you abide by this standard. Here are *some* of
the more common ``PEP8`` issues:

-- we restrict line-length to 79 characters to promote readability
-- passing arguments should have spaces after commas, e.g. ``foo(arg1, arg2, kw1='bar')``
+* we restrict line-length to 79 characters to promote readability
+* passing arguments should have spaces after commas, e.g. ``foo(arg1, arg2, kw1='bar')``

:ref:`Continuous Integration <contributing.ci>` will run
the `flake8 <https://pypi.org/project/flake8>`_ tool
@@ -715,14 +715,14 @@ Using ``pytest``

Here is an example of a self-contained set of tests that illustrate multiple features that we like to use.

-- functional style: tests are like ``test_*`` and *only* take arguments that are either fixtures or parameters
-- ``pytest.mark`` can be used to set metadata on test functions, e.g. ``skip`` or ``xfail``.
-- using ``parametrize``: allow testing of multiple cases
-- to set a mark on a parameter, ``pytest.param(..., marks=...)`` syntax should be used
-- ``fixture``, code for object construction, on a per-test basis
-- using bare ``assert`` for scalars and truth-testing
-- ``tm.assert_series_equal`` (and its counter part ``tm.assert_frame_equal``), for pandas object comparisons.
-- the typical pattern of constructing an ``expected`` and comparing versus the ``result``
+* functional style: tests are like ``test_*`` and *only* take arguments that are either fixtures or parameters
+* ``pytest.mark`` can be used to set metadata on test functions, e.g. ``skip`` or ``xfail``.
+* using ``parametrize``: allow testing of multiple cases
+* to set a mark on a parameter, ``pytest.param(..., marks=...)`` syntax should be used
+* ``fixture``, code for object construction, on a per-test basis
+* using bare ``assert`` for scalars and truth-testing
+* ``tm.assert_series_equal`` (and its counter part ``tm.assert_frame_equal``), for pandas object comparisons.
+* the typical pattern of constructing an ``expected`` and comparing versus the ``result``
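A minimal, self-contained sketch of the ``result``/``expected`` pattern from the list above, using the public ``pandas.testing`` module in place of the internal ``tm`` helpers and invoked directly rather than through a pytest run:

```python
import pandas as pd
import pandas.testing as pdt

def test_series_doubling():
    ser = pd.Series([1, 2, 3])
    result = ser * 2                 # exercise the behaviour under test
    expected = pd.Series([2, 4, 6])  # construct what we expect to see
    pdt.assert_series_equal(result, expected)

test_series_doubling()  # raises AssertionError on mismatch, silent on success
```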

We would name this file ``test_cool_feature.py`` and put in an appropriate place in the ``pandas/tests/`` structure.

@@ -969,21 +969,21 @@ Finally, commit your changes to your local repository with an explanatory messag
uses a convention for commit message prefixes and layout. Here are
some common prefixes along with general guidelines for when to use them:
-* ENH: Enhancement, new functionality
-* BUG: Bug fix
-* DOC: Additions/updates to documentation
-* TST: Additions/updates to tests
-* BLD: Updates to the build process/scripts
-* PERF: Performance improvement
-* CLN: Code cleanup
+* ENH: Enhancement, new functionality
+* BUG: Bug fix
+* DOC: Additions/updates to documentation
+* TST: Additions/updates to tests
+* BLD: Updates to the build process/scripts
+* PERF: Performance improvement
+* CLN: Code cleanup
The following defines how a commit message should be structured. Please reference the
relevant GitHub issues in your commit message using GH1234 or #1234. Either style
is fine, but the former is generally preferred:
-* a subject line with `< 80` chars.
-* One blank line.
-* Optionally, a commit message body.
+* a subject line with `< 80` chars.
+* One blank line.
+* Optionally, a commit message body.
Now you can commit your changes in your local repository::
Expand Down