What's new in 1.3.0 (??)
------------------------

These are the changes in pandas 1.3.0. See :ref:`release` for a full changelog including other versions of pandas.

{{ header }}

.. warning::

   When reading new Excel 2007+ (``.xlsx``) files, the default argument ``engine=None`` to :func:`~pandas.read_excel` will now result in using the openpyxl engine in all cases when the option :attr:`io.excel.xlsx.reader` is set to ``"auto"``. Previously, some cases would use the xlrd engine instead. See :ref:`What's new 1.2.0 <whatsnew_120>` for background on this change.

Enhancements
~~~~~~~~~~~~

Custom HTTP(s) headers when reading csv or json files
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

When reading from a remote URL that is not handled by fsspec (i.e. HTTP and HTTPS), the dictionary passed to ``storage_options`` will be used to create the headers included in the request. This can be used to control the ``User-Agent`` header or send other custom headers (:issue:`36688`). For example:

.. ipython:: python

    headers = {"User-Agent": "pandas"}
    df = pd.read_csv(
        "https://download.bls.gov/pub/time.series/cu/cu.item",
        sep="\t",
        storage_options=headers
    )

Read and write XML documents
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

We added I/O support to read and render shallow versions of XML documents with :func:`pandas.read_xml` and :meth:`DataFrame.to_xml`. With lxml as the parser, both XPath 1.0 and XSLT 1.0 are available (:issue:`27554`).

.. code-block:: ipython

   In [1]: xml = """<?xml version='1.0' encoding='utf-8'?>
      ...: <data>
      ...:  <row>
      ...:     <shape>square</shape>
      ...:     <degrees>360</degrees>
      ...:     <sides>4.0</sides>
      ...:  </row>
      ...:  <row>
      ...:     <shape>circle</shape>
      ...:     <degrees>360</degrees>
      ...:     <sides/>
      ...:  </row>
      ...:  <row>
      ...:     <shape>triangle</shape>
      ...:     <degrees>180</degrees>
      ...:     <sides>3.0</sides>
      ...:  </row>
      ...:  </data>"""

   In [2]: df = pd.read_xml(xml)

   In [3]: df
   Out[3]:
         shape  degrees  sides
   0    square      360    4.0
   1    circle      360    NaN
   2  triangle      180    3.0

   In [4]: df.to_xml()
   Out[4]:
   <?xml version='1.0' encoding='utf-8'?>
   <data>
     <row>
       <index>0</index>
       <shape>square</shape>
       <degrees>360</degrees>
       <sides>4.0</sides>
     </row>
     <row>
       <index>1</index>
       <shape>circle</shape>
       <degrees>360</degrees>
       <sides/>
     </row>
     <row>
       <index>2</index>
       <shape>triangle</shape>
       <degrees>180</degrees>
       <sides>3.0</sides>
     </row>
   </data>

For more, see :ref:`io.xml` in the user guide on IO tools.

Styler Upgrades
^^^^^^^^^^^^^^^

We provided some focused development on :class:`.Styler`, including altering methods to accept more universal CSS language for arguments, such as ``'color:red;'`` instead of ``[('color', 'red')]`` (:issue:`39564`). This is also added to the built-in methods to allow custom CSS highlighting instead of default background coloring (:issue:`40242`). Enhancements to other built-in methods include extending the :meth:`.Styler.background_gradient` method to shade elements based on a given gradient map and not be restricted only to values in the DataFrame (:issue:`39930`, :issue:`22727`, :issue:`28901`). Additional built-in methods such as :meth:`.Styler.highlight_between`, :meth:`.Styler.highlight_quantile` and :meth:`.Styler.text_gradient` have been added (:issue:`39821`, :issue:`40926`, :issue:`41098`).

:meth:`.Styler.apply` now consistently accepts functions with ``ndarray`` output, allowing more flexible development of UDFs when ``axis`` is ``None``, ``0`` or ``1`` (:issue:`39393`).

:meth:`.Styler.set_tooltips` is a new method that allows adding on-hover tooltips to enhance interactive displays (:issue:`35643`). :meth:`.Styler.set_td_classes`, which was recently introduced in v1.2.0 (:issue:`36159`) to allow adding specific CSS classes to data cells, has been made as performant as :meth:`.Styler.apply` and :meth:`.Styler.applymap` (:issue:`40453`), if not more performant in some cases. The overall performance of HTML render times has been considerably improved to match :meth:`DataFrame.to_html` (:issue:`39952`, :issue:`37792`, :issue:`40425`).

:meth:`.Styler.format` has had upgrades to easily format missing data, control precision, and perform HTML escaping (:issue:`40437`, :issue:`40134`). There have been numerous other bug fixes to properly format HTML and eliminate some inconsistencies (:issue:`39942`, :issue:`40356`, :issue:`39807`, :issue:`39889`, :issue:`39627`).

:class:`.Styler` is now compatible with a non-unique index or columns; most features are fully compatible, with the remainder made only partially compatible (:issue:`41269`). One also has greater control of the display through separate sparsification of the index or columns, using the new ``styler`` options context (:issue:`41142`).

We have added an extension to allow LaTeX styling as an alternative to CSS styling, and a method :meth:`.Styler.to_latex` which renders the necessary LaTeX format including built-up styles. An additional file output function :meth:`Styler.to_html` has been added for convenience (:issue:`40312`).

Documentation has also seen major revisions in light of new features (:issue:`39720`, :issue:`39317`, :issue:`40493`).

DataFrame constructor honors ``copy=False`` with dict
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

When passing a dictionary to :class:`DataFrame` with ``copy=False``, a copy will no longer be made (:issue:`32960`).

.. ipython:: python

    arr = np.array([1, 2, 3])
    df = pd.DataFrame({"A": arr, "B": arr.copy()}, copy=False)
    df

``df["A"]`` remains a view on ``arr``:

.. ipython:: python

    arr[0] = 0
    assert df.iloc[0, 0] == 0

The default behavior when not passing ``copy`` will remain unchanged, i.e. a copy will be made.
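
A minimal sketch of that unchanged default (no ``copy`` argument passed), where the frame does not share data with the source array:

```python
import numpy as np
import pandas as pd

arr = np.array([1, 2, 3])
df = pd.DataFrame({"A": arr})  # default: a copy of arr is made

arr[0] = 0         # mutating the source array...
df.iloc[0, 0]      # ...does not affect the frame; still 1
```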

Centered Datetime-Like Rolling Windows
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

When performing rolling calculations on :class:`DataFrame` and :class:`Series` objects with a datetime-like index, a centered datetime-like window can now be used (:issue:`38780`). For example:

.. ipython:: python

    df = pd.DataFrame(
        {"A": [0, 1, 2, 3, 4]}, index=pd.date_range("2020", periods=5, freq="1D")
    )
    df
    df.rolling("2D", center=True).mean()


Other enhancements
^^^^^^^^^^^^^^^^^^

Notable bug fixes
~~~~~~~~~~~~~~~~~

These are bug fixes that might have notable behavior changes.

``Categorical.unique`` now always maintains same dtype as original
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Previously, when calling :meth:`~Categorical.unique` with categorical data, unused categories in the new array would be removed, meaning that the dtype of the new array would be different from the original if some categories were not present in the unique array (:issue:`18291`).

As an example of this, given:

.. ipython:: python

        dtype = pd.CategoricalDtype(['bad', 'neutral', 'good'], ordered=True)
        cat = pd.Categorical(['good', 'good', 'bad', 'bad'], dtype=dtype)
        original = pd.Series(cat)
        unique = original.unique()

*pandas < 1.3.0*:

.. code-block:: ipython

   In [1]: unique
   ['good', 'bad']
   Categories (2, object): ['bad' < 'good']

   In [2]: original.dtype == unique.dtype
   False

*pandas >= 1.3.0*:

.. ipython:: python

        unique
        original.dtype == unique.dtype

:meth:`~pandas.DataFrame.combine_first` will now preserve dtypes (:issue:`7509`)

.. ipython:: python

   df1 = pd.DataFrame({"A": [1, 2, 3], "B": [1, 2, 3]}, index=[0, 1, 2])
   df1
   df2 = pd.DataFrame({"B": [4, 5, 6], "C": [1, 2, 3]}, index=[2, 3, 4])
   df2
   combined = df1.combine_first(df2)

*pandas 1.2.x*:

.. code-block:: ipython

   In [1]: combined.dtypes
   Out[1]:
   A    float64
   B    float64
   C    float64
   dtype: object

*pandas 1.3.0*:

.. ipython:: python

   combined.dtypes

Group by methods ``agg`` and ``transform`` no longer change return dtype for callables
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Previously the methods :meth:`.DataFrameGroupBy.aggregate`, :meth:`.SeriesGroupBy.aggregate`, :meth:`.DataFrameGroupBy.transform`, and :meth:`.SeriesGroupBy.transform` might cast the result dtype when the argument ``func`` is callable, possibly leading to undesirable results (:issue:`21240`). The cast would occur if the result is numeric and casting back to the input dtype does not change any values as measured by ``np.allclose``. Now no such casting occurs.

.. ipython:: python

    df = pd.DataFrame({'key': [1, 1], 'a': [True, False], 'b': [True, True]})
    df

*pandas 1.2.x*:

.. code-block:: ipython

   In [5]: df.groupby('key').agg(lambda x: x.sum())
   Out[5]:
           a  b
   key
   1    True  2

*pandas 1.3.0*:

.. ipython:: python

    df.groupby('key').agg(lambda x: x.sum())

Previously, these methods could result in different dtypes depending on the input values. Now, these methods will always return a float dtype. (:issue:`41137`)

.. ipython:: python

    df = pd.DataFrame({'a': [True], 'b': [1], 'c': [1.0]})

*pandas 1.2.x*:

.. code-block:: ipython

   In [5]: df.groupby(df.index).mean()
   Out[5]:
           a  b    c
   0    True  1  1.0

*pandas 1.3.0*:

.. ipython:: python

    df.groupby(df.index).mean()

Try operating inplace when setting values with ``loc`` and ``iloc``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

When setting an entire column using ``loc`` or ``iloc``, pandas will try to insert the values into the existing data rather than create an entirely new array.

.. ipython:: python

   df = pd.DataFrame(range(3), columns=["A"], dtype="float64")
   values = df.values
   new = np.array([5, 6, 7], dtype="int64")
   df.loc[[0, 1, 2], "A"] = new

In both the new and old behavior, the data in ``values`` is overwritten, but in the old behavior the dtype of ``df["A"]`` changed to ``int64``.

*pandas 1.2.x*:

.. code-block:: ipython

   In [1]: df.dtypes
   Out[1]:
   A    int64
   dtype: object

   In [2]: np.shares_memory(df["A"].values, new)
   Out[2]: False

   In [3]: np.shares_memory(df["A"].values, values)
   Out[3]: False

In pandas 1.3.0, ``df`` continues to share data with ``values``:

*pandas 1.3.0*:

.. ipython:: python

   df.dtypes
   np.shares_memory(df["A"].values, new)
   np.shares_memory(df["A"].values, values)


Never Operate Inplace When Setting ``frame[keys] = values``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

When setting multiple columns using ``frame[keys] = values``, new arrays will replace the pre-existing arrays for these keys, which will not be over-written (:issue:`39510`). As a result, the columns will retain the dtype(s) of ``values``, never casting to the dtypes of the existing arrays.

.. ipython:: python

   df = pd.DataFrame(range(3), columns=["A"], dtype="float64")
   df[["A"]] = 5

In the old behavior, ``5`` was cast to ``float64`` and inserted into the existing array backing ``df``:

*pandas 1.2.x*:

.. code-block:: ipython

   In [1]: df.dtypes
   Out[1]:
   A    float64
   dtype: object

In the new behavior, we get a new array, and retain an integer-dtyped ``5``:

*pandas 1.3.0*:

.. ipython:: python

   df.dtypes


Consistent Casting With Setting Into Boolean Series
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Setting non-boolean values into a :class:`Series` with ``dtype=bool`` now consistently casts to ``dtype=object`` (:issue:`38709`).

.. ipython:: python

   orig = pd.Series([True, False])
   ser = orig.copy()
   ser.iloc[1] = np.nan
   ser2 = orig.copy()
   ser2.iloc[1] = 2.0

*pandas 1.2.x*:

.. code-block:: ipython

   In [1]: ser
   Out[1]:
   0    1.0
   1    NaN
   dtype: float64

   In [2]: ser2
   Out[2]:
   0    True
   1     2.0
   dtype: object

*pandas 1.3.0*:

.. ipython:: python

   ser
   ser2


``GroupBy.rolling`` no longer returns grouped-by column in values
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The group-by column will now be dropped from the result of a ``groupby.rolling`` operation (:issue:`32262`).

.. ipython:: python

    df = pd.DataFrame({"A": [1, 1, 2, 3], "B": [0, 1, 2, 3]})
    df

*Previous behavior*:

.. code-block:: ipython

   In [1]: df.groupby("A").rolling(2).sum()
   Out[1]:
          A    B
   A
   1 0  NaN  NaN
     1  2.0  1.0
   2 2  NaN  NaN
   3 3  NaN  NaN

*New behavior*:

.. ipython:: python

    df.groupby("A").rolling(2).sum()

Removed artificial truncation in rolling variance and standard deviation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

:meth:`core.window.Rolling.std` and :meth:`core.window.Rolling.var` will no longer artificially truncate results that are less than ~1e-8 and ~1e-15 respectively to zero (:issue:`37051`, :issue:`40448`, :issue:`39872`).

However, floating point artifacts may now exist in the results when rolling over larger values.

.. ipython:: python

   s = pd.Series([7, 5, 5, 5])
   s.rolling(3).var()
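
One way to limit such artifacts (a sketch of a user-side workaround, not part of the pandas API change) is to de-mean the series first, since variance is invariant to a constant shift:

```python
import pandas as pd

s = pd.Series([7, 5, 5, 5]) + 1e9  # large values, small spread

# Subtracting a constant leaves the variance unchanged but keeps the
# intermediate sums small, reducing floating point error.
demeaned_var = (s - s.mean()).rolling(3).var()
```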

``GroupBy.rolling`` with ``MultiIndex`` no longer drops levels in the result
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

:class:`core.window.rolling.RollingGroupby` will no longer drop levels of a :class:`DataFrame` with a :class:`MultiIndex` in the result. This can lead to a perceived duplication of levels in the resulting :class:`MultiIndex`, but this change restores the behavior that was present in version 1.1.3 (:issue:`38787`, :issue:`38523`).

.. ipython:: python

   index = pd.MultiIndex.from_tuples([('idx1', 'idx2')], names=['label1', 'label2'])
   df = pd.DataFrame({'a': [1], 'b': [2]}, index=index)
   df

*Previous behavior*:

.. code-block:: ipython

   In [1]: df.groupby('label1').rolling(1).sum()
   Out[1]:
             a    b
   label1
   idx1    1.0  2.0

*New behavior*:

.. ipython:: python

    df.groupby('label1').rolling(1).sum()


Increased minimum versions for dependencies
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Some minimum supported versions of dependencies were updated. If installed, we now require:

+-----------------+-----------------+----------+---------+
| Package         | Minimum Version | Required | Changed |
+=================+=================+==========+=========+
| numpy           | 1.17.3          | X        | X       |
+-----------------+-----------------+----------+---------+
| pytz            | 2017.3          | X        |         |
+-----------------+-----------------+----------+---------+
| python-dateutil | 2.7.3           | X        |         |
+-----------------+-----------------+----------+---------+
| bottleneck      | 1.2.1           |          |         |
+-----------------+-----------------+----------+---------+
| numexpr         | 2.7.0           |          | X       |
+-----------------+-----------------+----------+---------+
| pytest (dev)    | 6.0             |          | X       |
+-----------------+-----------------+----------+---------+
| mypy (dev)      | 0.800           |          | X       |
+-----------------+-----------------+----------+---------+
| setuptools      | 38.6.0          |          | X       |
+-----------------+-----------------+----------+---------+

For optional libraries the general recommendation is to use the latest version. The following table lists the lowest version per library that is currently being tested throughout the development of pandas. Optional libraries below the lowest tested version may still work, but are not considered supported.

+----------------+-----------------+---------+
| Package        | Minimum Version | Changed |
+================+=================+=========+
| beautifulsoup4 | 4.6.0           |         |
+----------------+-----------------+---------+
| fastparquet    | 0.4.0           | X       |
+----------------+-----------------+---------+
| fsspec         | 0.7.4           |         |
+----------------+-----------------+---------+
| gcsfs          | 0.6.0           |         |
+----------------+-----------------+---------+
| lxml           | 4.3.0           |         |
+----------------+-----------------+---------+
| matplotlib     | 2.2.3           |         |
+----------------+-----------------+---------+
| numba          | 0.46.0          |         |
+----------------+-----------------+---------+
| openpyxl       | 3.0.0           | X       |
+----------------+-----------------+---------+
| pyarrow        | 0.17.0          | X       |
+----------------+-----------------+---------+
| pymysql        | 0.8.1           | X       |
+----------------+-----------------+---------+
| pytables       | 3.5.1           |         |
+----------------+-----------------+---------+
| s3fs           | 0.4.0           |         |
+----------------+-----------------+---------+
| scipy          | 1.2.0           |         |
+----------------+-----------------+---------+
| sqlalchemy     | 1.3.0           | X       |
+----------------+-----------------+---------+
| tabulate       | 0.8.7           | X       |
+----------------+-----------------+---------+
| xarray         | 0.12.0          |         |
+----------------+-----------------+---------+
| xlrd           | 1.2.0           |         |
+----------------+-----------------+---------+
| xlsxwriter     | 1.0.2           |         |
+----------------+-----------------+---------+
| xlwt           | 1.3.0           |         |
+----------------+-----------------+---------+
| pandas-gbq     | 0.12.0          |         |
+----------------+-----------------+---------+

See :ref:`install.dependencies` and :ref:`install.optional_dependencies` for more.

Other API changes
~~~~~~~~~~~~~~~~~

- Partially initialized :class:`CategoricalDtype` objects (i.e. those with ``categories=None``) will no longer compare as equal to fully initialized dtype objects
- Accessing ``_constructor_expanddim`` on a :class:`DataFrame` and ``_constructor_sliced`` on a :class:`Series` now raise an ``AttributeError``. Previously a ``NotImplementedError`` was raised (:issue:`38782`)
- Added new ``engine`` and ``**engine_kwargs`` parameters to :meth:`DataFrame.to_sql` to support other future "SQL engines". Currently we still only use SQLAlchemy under the hood, but more engines are planned to be supported such as turbodbc (:issue:`36893`)

Build
~~~~~

- Documentation in ``.pptx`` and ``.pdf`` formats is no longer included in wheels or source distributions (:issue:`30741`)

Deprecations
~~~~~~~~~~~~

Deprecated Dropping Nuisance Columns in DataFrame Reductions and DataFrameGroupBy Operations
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

When calling a reduction (``.min``, ``.max``, ``.sum``, ...) on a :class:`DataFrame` with ``numeric_only=None`` (the default), columns on which the reduction raises ``TypeError`` are silently ignored and dropped from the result.

This behavior is deprecated. In a future version, the TypeError will be raised, and users will need to select only valid columns before calling the function.

For example:

.. ipython:: python

   df = pd.DataFrame({"A": [1, 2, 3, 4], "B": pd.date_range("2016-01-01", periods=4)})
   df

*Old behavior*:

.. code-block:: ipython

   In [3]: df.prod()
   Out[3]:
   A    24
   dtype: int64

*Future behavior*:

.. code-block:: ipython

   In [4]: df.prod()
   ...
   TypeError: 'DatetimeArray' does not implement reduction 'prod'

   In [5]: df[["A"]].prod()
   Out[5]:
   A    24
   dtype: int64

Similarly, when applying a function to :class:`DataFrameGroupBy`, columns on which the function raises TypeError are currently silently ignored and dropped from the result.

This behavior is deprecated. In a future version, the TypeError will be raised, and users will need to select only valid columns before calling the function.

For example:

.. ipython:: python

   df = pd.DataFrame({"A": [1, 2, 3, 4], "B": pd.date_range("2016-01-01", periods=4)})
   gb = df.groupby([1, 1, 2, 2])

*Old behavior*:

.. code-block:: ipython

   In [4]: gb.prod(numeric_only=False)
   Out[4]:
       A
   1   2
   2  12

*Future behavior*:

.. code-block:: ipython

   In [5]: gb.prod(numeric_only=False)
   ...
   TypeError: datetime64 type does not support prod operations

   In [6]: gb[["A"]].prod(numeric_only=False)
   Out[6]:
       A
   1   2
   2  12

Performance improvements
~~~~~~~~~~~~~~~~~~~~~~~~

Bug fixes
~~~~~~~~~

Categorical
^^^^^^^^^^^

Datetimelike
^^^^^^^^^^^^

Timedelta
^^^^^^^^^

Timezones
^^^^^^^^^

- Bug in different ``tzinfo`` objects representing UTC not being treated as equivalent (:issue:`39216`)
- Bug in ``dateutil.tz.gettz("UTC")`` not being recognized as equivalent to other UTC-representing tzinfos (:issue:`39276`)
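
As an illustration of the two fixes above (a sketch assuming pandas >= 1.3), Series localized with different UTC ``tzinfo`` objects now compare as having equal dtypes:

```python
from datetime import timezone

import dateutil.tz
import pandas as pd

idx = pd.to_datetime(["2021-01-01"])
s1 = pd.Series(idx).dt.tz_localize(dateutil.tz.gettz("UTC"))
s2 = pd.Series(idx).dt.tz_localize(timezone.utc)

# The differing UTC tzinfo objects are now treated as equivalent.
s1.dtype == s2.dtype
```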

Numeric
^^^^^^^

Conversion
^^^^^^^^^^

Strings
^^^^^^^

Interval
^^^^^^^^

Indexing
^^^^^^^^

Missing
^^^^^^^

MultiIndex
^^^^^^^^^^

I/O
^^^

Period
^^^^^^

Plotting
^^^^^^^^

Groupby/resample/rolling
^^^^^^^^^^^^^^^^^^^^^^^^

Reshaping
^^^^^^^^^

Sparse
^^^^^^

ExtensionArray
^^^^^^^^^^^^^^

Styler
^^^^^^

Other
^^^^^

Contributors
~~~~~~~~~~~~