These are the changes in pandas 1.3.0. See :ref:`release` for a full changelog including other versions of pandas.
{{ header }}
.. warning::

   When reading new Excel 2007+ (``.xlsx``) files, the default argument
   ``engine=None`` to :func:`~pandas.read_excel` will now result in using the
   openpyxl engine in all cases when the option :attr:`io.excel.xlsx.reader`
   is set to ``"auto"``. Previously, some cases would use the xlrd engine
   instead. See :ref:`What's new 1.2.0 <whatsnew_120>` for background on this change.
When reading from a remote URL that is not handled by fsspec (i.e. HTTP and
HTTPS), the dictionary passed to ``storage_options`` will be used to create the
headers included in the request. This can be used to control the User-Agent
header or send other custom headers (:issue:`36688`).
For example:

.. ipython:: python

    headers = {"User-Agent": "pandas"}
    df = pd.read_csv(
        "https://download.bls.gov/pub/time.series/cu/cu.item",
        sep="\t",
        storage_options=headers
    )
We added I/O support to read and render shallow versions of XML documents with :func:`pandas.read_xml` and :meth:`DataFrame.to_xml`. When lxml is used as the parser, both XPath 1.0 and XSLT 1.0 are available (:issue:`27554`).
.. code-block:: ipython

    In [1]: xml = """<?xml version='1.0' encoding='utf-8'?>
       ...: <data>
       ...:   <row>
       ...:     <shape>square</shape>
       ...:     <degrees>360</degrees>
       ...:     <sides>4.0</sides>
       ...:   </row>
       ...:   <row>
       ...:     <shape>circle</shape>
       ...:     <degrees>360</degrees>
       ...:     <sides/>
       ...:   </row>
       ...:   <row>
       ...:     <shape>triangle</shape>
       ...:     <degrees>180</degrees>
       ...:     <sides>3.0</sides>
       ...:   </row>
       ...: </data>"""

    In [2]: df = pd.read_xml(xml)
    In [3]: df
    Out[3]:
          shape  degrees  sides
    0    square      360    4.0
    1    circle      360    NaN
    2  triangle      180    3.0

    In [4]: df.to_xml()
    Out[4]:
    <?xml version='1.0' encoding='utf-8'?>
    <data>
      <row>
        <index>0</index>
        <shape>square</shape>
        <degrees>360</degrees>
        <sides>4.0</sides>
      </row>
      <row>
        <index>1</index>
        <shape>circle</shape>
        <degrees>360</degrees>
        <sides/>
      </row>
      <row>
        <index>2</index>
        <shape>triangle</shape>
        <degrees>180</degrees>
        <sides>3.0</sides>
      </row>
    </data>
For more, see :ref:`io.xml` in the user guide on IO tools.
We provided some focused development on :class:`.Styler`, including altering methods
to accept more universal CSS language for arguments, such as ``'color:red;'`` instead of
``[('color', 'red')]`` (:issue:`39564`). This is also added to the built-in methods
to allow custom CSS highlighting instead of default background coloring (:issue:`40242`).
Enhancements to other built-in methods include extending the :meth:`.Styler.background_gradient`
method to shade elements based on a given gradient map and not be restricted only to
values in the DataFrame (:issue:`39930` :issue:`22727` :issue:`28901`). Additional
built-in methods such as :meth:`.Styler.highlight_between`, :meth:`.Styler.highlight_quantile`
and :meth:`.Styler.text_gradient` have been added (:issue:`39821`, :issue:`40926`, :issue:`41098`).

The :meth:`.Styler.apply` now consistently allows functions with ``ndarray`` output to
allow more flexible development of UDFs when ``axis`` is ``None``, ``0`` or ``1`` (:issue:`39393`).
:meth:`.Styler.set_tooltips` is a new method that allows adding on hover tooltips to enhance interactive displays (:issue:`35643`). :meth:`.Styler.set_td_classes`, which was recently introduced in v1.2.0 (:issue:`36159`) to allow adding specific CSS classes to data cells, has been made as performant as :meth:`.Styler.apply` and :meth:`.Styler.applymap` (:issue:`40453`), if not more performant in some cases. The overall performance of HTML render times has been considerably improved to match :meth:`DataFrame.to_html` (:issue:`39952` :issue:`37792` :issue:`40425`).
The :meth:`.Styler.format` has had upgrades to easily format missing data, precision, and perform HTML escaping (:issue:`40437` :issue:`40134`). There have been numerous other bug fixes to properly format HTML and eliminate some inconsistencies (:issue:`39942` :issue:`40356` :issue:`39807` :issue:`39889` :issue:`39627`)
:class:`.Styler` has also been made compatible with non-unique index or columns, at least for as many features as can be fully supported; other features are only partially compatible (:issue:`41269`). One also has greater control of the display through separate sparsification of the index or columns, using the new 'styler' options context (:issue:`41142`).
We have added an extension to allow LaTeX styling as an alternative to CSS styling and a method :meth:`.Styler.to_latex` which renders the necessary LaTeX format including built-up styles. An additional file io function :meth:`Styler.to_html` has been added for convenience (:issue:`40312`).
Documentation has also seen major revisions in light of new features (:issue:`39720` :issue:`39317` :issue:`40493`)
When passing a dictionary to :class:`DataFrame` with ``copy=False``,
a copy will no longer be made (:issue:`32960`)

.. ipython:: python

    arr = np.array([1, 2, 3])
    df = pd.DataFrame({"A": arr, "B": arr.copy()}, copy=False)
    df

``df["A"]`` remains a view on ``arr``:

.. ipython:: python

    arr[0] = 0
    assert df.iloc[0, 0] == 0

The default behavior when not passing ``copy`` will remain unchanged, i.e.
a copy will be made.
When performing rolling calculations on :class:`DataFrame` and :class:`Series` objects with a datetime-like index, a centered datetime-like window can now be used (:issue:`38780`). For example:
.. ipython:: python

    df = pd.DataFrame(
        {"A": [0, 1, 2, 3, 4]}, index=pd.date_range("2020", periods=5, freq="1D")
    )
    df
    df.rolling("2D", center=True).mean()
- :class:`Rolling` and :class:`Expanding` now support a ``method`` argument with a ``'table'`` option that performs the windowing operation over an entire :class:`DataFrame`. See :ref:`window.overview` for performance and functional benefits (:issue:`15095`, :issue:`38995`)
- Added :meth:`MultiIndex.dtypes` (:issue:`37062`)
- Added ``end`` and ``end_day`` options for ``origin`` in :meth:`DataFrame.resample` (:issue:`37804`)
- Improve error message when ``usecols`` and ``names`` do not match for :func:`read_csv` and ``engine="c"`` (:issue:`29042`)
- Improved consistency of error message when passing an invalid ``win_type`` argument in :class:`Window` (:issue:`15969`)
- :func:`pandas.read_sql_query` now accepts a ``dtype`` argument to cast the columnar data from the SQL database based on user input (:issue:`10285`)
- Improved integer type mapping from pandas to SQLAlchemy when using :meth:`DataFrame.to_sql` (:issue:`35076`)
- :func:`to_numeric` now supports downcasting of nullable ``ExtensionDtype`` objects (:issue:`33013`)
- Add support for dict-like names in :class:`MultiIndex.set_names` and :class:`MultiIndex.rename` (:issue:`20421`)
- :func:`pandas.read_excel` can now auto detect .xlsb files and older .xls files (:issue:`35416`, :issue:`41225`)
- :class:`pandas.ExcelWriter` now accepts an ``if_sheet_exists`` parameter to control the behaviour of append mode when writing to existing sheets (:issue:`40230`)
- :meth:`.Rolling.sum`, :meth:`.Expanding.sum`, :meth:`.Rolling.mean`, :meth:`.Expanding.mean`, :meth:`.ExponentialMovingWindow.mean`, :meth:`.Rolling.median`, :meth:`.Expanding.median`, :meth:`.Rolling.max`, :meth:`.Expanding.max`, :meth:`.Rolling.min`, and :meth:`.Expanding.min` now support Numba execution with the ``engine`` keyword (:issue:`38895`, :issue:`41267`)
- :meth:`DataFrame.apply` can now accept NumPy unary operators as strings, e.g. ``df.apply("sqrt")``, which was already the case for :meth:`Series.apply` (:issue:`39116`)
- :meth:`DataFrame.apply` can now accept non-callable DataFrame properties as strings, e.g. ``df.apply("size")``, which was already the case for :meth:`Series.apply` (:issue:`39116`)
- :meth:`DataFrame.applymap` can now accept kwargs to pass on to func (:issue:`39987`)
- Disallow :class:`DataFrame` indexer for ``iloc`` for :meth:`Series.__getitem__` and :meth:`DataFrame.__getitem__` (:issue:`39004`)
- :meth:`Series.apply` can now accept list-like or dictionary-like arguments that aren't lists or dictionaries, e.g. ``ser.apply(np.array(["sum", "mean"]))``, which was already the case for :meth:`DataFrame.apply` (:issue:`39140`)
- :meth:`DataFrame.plot.scatter` can now accept a categorical column as the argument to ``c`` (:issue:`12380`, :issue:`31357`)
- :meth:`.Styler.set_tooltips` allows on hover tooltips to be added to styled HTML dataframes (:issue:`35643`, :issue:`21266`, :issue:`39317`, :issue:`39708`, :issue:`40284`)
- :meth:`.Styler.set_table_styles` amended to optionally allow certain css-string input arguments (:issue:`39564`)
- :meth:`.Styler.apply` now more consistently accepts ndarray function returns, i.e. in all cases for ``axis`` is ``0, 1 or None`` (:issue:`39359`)
- :meth:`.Styler.apply` and :meth:`.Styler.applymap` now raise errors if wrong format CSS is passed on render (:issue:`39660`)
- :meth:`.Styler.format` adds keyword argument ``escape`` for optional HTML escaping (:issue:`40437`)
- :meth:`.Styler.background_gradient` now allows the ability to supply a specific gradient map (:issue:`22727`)
- :meth:`.Styler.clear` now clears :attr:`Styler.hidden_index` and :attr:`Styler.hidden_columns` as well (:issue:`40484`)
- Builtin highlighting methods in :class:`Styler` have a more consistent signature and css customisability (:issue:`40242`)
- :meth:`.Styler.highlight_between` added to list of builtin styling methods (:issue:`39821`)
- :meth:`Series.loc.__getitem__` and :meth:`Series.loc.__setitem__` with :class:`MultiIndex` now raising helpful error message when indexer has too many dimensions (:issue:`35349`)
- :meth:`pandas.read_stata` and :class:`StataReader` support reading data from compressed files.
- Add support for parsing ``ISO 8601``-like timestamps with negative signs to :meth:`pandas.Timedelta` (:issue:`37172`)
- Add support for unary operators in :class:`FloatingArray` (:issue:`38749`)
- :class:`RangeIndex` can now be constructed by passing a ``range`` object directly e.g. ``pd.RangeIndex(range(3))`` (:issue:`12067`)
- :meth:`round` being enabled for the nullable integer and floating dtypes (:issue:`38844`)
- :meth:`pandas.read_csv` and :meth:`pandas.read_json` expose the argument ``encoding_errors`` to control how encoding errors are handled (:issue:`39450`)
- :meth:`.GroupBy.any` and :meth:`.GroupBy.all` use Kleene logic with nullable data types (:issue:`37506`)
- :meth:`.GroupBy.any` and :meth:`.GroupBy.all` return a ``BooleanDtype`` for columns with nullable data types (:issue:`33449`)
- :meth:`.GroupBy.rank` now supports object-dtype data (:issue:`38278`)
- Constructing a :class:`DataFrame` or :class:`Series` with the ``data`` argument being a Python iterable that is not a NumPy ``ndarray`` consisting of NumPy scalars will now result in a dtype with a precision the maximum of the NumPy scalars; this was already the case when ``data`` is a NumPy ``ndarray`` (:issue:`40908`)
- Add keyword ``sort`` to :func:`pivot_table` to allow non-sorting of the result (:issue:`39143`)
- Add keyword ``dropna`` to :meth:`DataFrame.value_counts` to allow counting rows that include ``NA`` values (:issue:`41325`)
- :meth:`Series.replace` will now cast results to ``PeriodDtype`` where possible instead of ``object`` dtype (:issue:`41526`)
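A few of the enhancements listed above can be tried in a short session. This is a minimal sketch rather than an excerpt from the release notes: the sample data and the variable names (``idx``, ``mi``, ``renamed``, ``counts``) are made up for illustration.

```python
import numpy as np
import pandas as pd

# RangeIndex can be built directly from a Python range object.
idx = pd.RangeIndex(range(3))

# MultiIndex.dtypes reports the dtype of each level, keyed by level name.
mi = pd.MultiIndex.from_product(
    [["a", "b"], [1, 2]], names=["letter", "number"]
)
level_dtypes = mi.dtypes

# Dict-like names rename only the levels that are listed (old name -> new name).
renamed = mi.set_names({"letter": "char"})

# DataFrame.value_counts(dropna=False) also counts rows that contain NA values.
df = pd.DataFrame({"x": [1.0, 1.0, np.nan], "y": ["u", "u", "v"]})
counts = df.value_counts(dropna=False)

print(idx)
print(level_dtypes)
print(renamed.names)
print(counts)
```

With the default ``dropna=True`` the row containing ``NaN`` would be left out of ``counts``.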
These are bug fixes that might have notable behavior changes.
Previously, when calling :meth:`~Categorical.unique` with categorical data, unused categories in the new array would be removed, meaning that the dtype of the new array would be different than the original, if some categories are not present in the unique array (:issue:`18291`)
As an example of this, given:
.. ipython:: python

    dtype = pd.CategoricalDtype(['bad', 'neutral', 'good'], ordered=True)
    cat = pd.Categorical(['good', 'good', 'bad', 'bad'], dtype=dtype)
    original = pd.Series(cat)
    unique = original.unique()

pandas < 1.3.0:

.. code-block:: ipython

    In [1]: unique
    ['good', 'bad']
    Categories (2, object): ['bad' < 'good']
    In [2]: original.dtype == unique.dtype
    False

pandas >= 1.3.0

.. ipython:: python

    unique
    original.dtype == unique.dtype
Preserve dtypes in :meth:`~pandas.DataFrame.combine_first`
:meth:`~pandas.DataFrame.combine_first` will now preserve dtypes (:issue:`7509`)
.. ipython:: python

    df1 = pd.DataFrame({"A": [1, 2, 3], "B": [1, 2, 3]}, index=[0, 1, 2])
    df1
    df2 = pd.DataFrame({"B": [4, 5, 6], "C": [1, 2, 3]}, index=[2, 3, 4])
    df2
    combined = df1.combine_first(df2)

pandas 1.2.x

.. code-block:: ipython

    In [1]: combined.dtypes
    Out[1]:
    A    float64
    B    float64
    C    float64
    dtype: object

pandas 1.3.0

.. ipython:: python

    combined.dtypes
Previously the methods :meth:`.DataFrameGroupBy.aggregate`,
:meth:`.SeriesGroupBy.aggregate`, :meth:`.DataFrameGroupBy.transform`, and
:meth:`.SeriesGroupBy.transform` might cast the result dtype when the argument ``func``
is callable, possibly leading to undesirable results (:issue:`21240`). The cast would
occur if the result is numeric and casting back to the input dtype does not change any
values as measured by ``np.allclose``. Now no such casting occurs.

.. ipython:: python

    df = pd.DataFrame({'key': [1, 1], 'a': [True, False], 'b': [True, True]})
    df

pandas 1.2.x

.. code-block:: ipython

    In [5]: df.groupby('key').agg(lambda x: x.sum())
    Out[5]:
            a  b
    key
    1    True  2

pandas 1.3.0

.. ipython:: python

    df.groupby('key').agg(lambda x: x.sum())
``float`` result for :meth:`.GroupBy.mean`, :meth:`.GroupBy.median`, and :meth:`.GroupBy.var`

Previously, these methods could result in different dtypes depending on the input values. Now, these methods will always return a float dtype. (:issue:`41137`)

.. ipython:: python

    df = pd.DataFrame({'a': [True], 'b': [1], 'c': [1.0]})

pandas 1.2.x

.. code-block:: ipython

    In [5]: df.groupby(df.index).mean()
    Out[5]:
          a  b    c
    0  True  1  1.0

pandas 1.3.0

.. ipython:: python

    df.groupby(df.index).mean()
When setting an entire column using ``loc`` or ``iloc``, pandas will try to
insert the values into the existing data rather than create an entirely new array.

.. ipython:: python

    df = pd.DataFrame(range(3), columns=["A"], dtype="float64")
    values = df.values
    new = np.array([5, 6, 7], dtype="int64")
    df.loc[[0, 1, 2], "A"] = new

In both the new and old behavior, the data in ``values`` is overwritten, but in
the old behavior the dtype of ``df["A"]`` changed to ``int64``.

pandas 1.2.x

.. code-block:: ipython

    In [1]: df.dtypes
    Out[1]:
    A    int64
    dtype: object
    In [2]: np.shares_memory(df["A"].values, new)
    Out[2]: False
    In [3]: np.shares_memory(df["A"].values, values)
    Out[3]: False

In pandas 1.3.0, ``df`` continues to share data with ``values``

pandas 1.3.0

.. ipython:: python

    df.dtypes
    np.shares_memory(df["A"], new)
    np.shares_memory(df["A"], values)
When setting multiple columns using ``frame[keys] = values`` new arrays will
replace pre-existing arrays for these keys, which will not be over-written
(:issue:`39510`). As a result, the columns will retain the dtype(s) of ``values``,
never casting to the dtypes of the existing arrays.

.. ipython:: python

    df = pd.DataFrame(range(3), columns=["A"], dtype="float64")
    df[["A"]] = 5

In the old behavior, ``5`` was cast to ``float64`` and inserted into the existing
array backing ``df``:

pandas 1.2.x

.. code-block:: ipython

    In [1]: df.dtypes
    Out[1]:
    A    float64

In the new behavior, we get a new array, and retain an integer-dtyped ``5``:

pandas 1.3.0

.. ipython:: python

    df.dtypes
Setting non-boolean values into a :class:`Series` with ``dtype=bool`` now consistently
casts to ``dtype=object`` (:issue:`38709`)

.. ipython:: python

    orig = pd.Series([True, False])
    ser = orig.copy()
    ser.iloc[1] = np.nan
    ser2 = orig.copy()
    ser2.iloc[1] = 2.0

pandas 1.2.x

.. code-block:: ipython

    In [1]: ser
    Out[1]:
    0    1.0
    1    NaN
    dtype: float64

    In [2]: ser2
    Out[2]:
    0    True
    1     2.0
    dtype: object

pandas 1.3.0

.. ipython:: python

    ser
    ser2
The group-by column will now be dropped from the result of a
``groupby.rolling`` operation (:issue:`32262`)

.. ipython:: python

    df = pd.DataFrame({"A": [1, 1, 2, 3], "B": [0, 1, 2, 3]})
    df

Previous behavior:

.. code-block:: ipython

    In [1]: df.groupby("A").rolling(2).sum()
    Out[1]:
           A    B
    A
    1 0  NaN  NaN
      1  2.0  1.0
    2 2  NaN  NaN
    3 3  NaN  NaN

New behavior:

.. ipython:: python

    df.groupby("A").rolling(2).sum()
:meth:`core.window.Rolling.std` and :meth:`core.window.Rolling.var` will no longer
artificially truncate results that are less than ``~1e-8`` and ``~1e-15`` respectively to
zero (:issue:`37051`, :issue:`40448`, :issue:`39872`).

However, floating point artifacts may now exist in the results when rolling over larger values.

.. ipython:: python

    s = pd.Series([7, 5, 5, 5])
    s.rolling(3).var()
:class:`core.window.rolling.RollingGroupby` will no longer drop levels of a :class:`DataFrame` with a :class:`MultiIndex` in the result. This can lead to a perceived duplication of levels in the resulting :class:`MultiIndex`, but this change restores the behavior that was present in version 1.1.3 (:issue:`38787`, :issue:`38523`).
.. ipython:: python

    index = pd.MultiIndex.from_tuples([('idx1', 'idx2')], names=['label1', 'label2'])
    df = pd.DataFrame({'a': [1], 'b': [2]}, index=index)
    df

Previous behavior:

.. code-block:: ipython

    In [1]: df.groupby('label1').rolling(1).sum()
    Out[1]:
              a    b
    label1
    idx1    1.0  2.0

New behavior:

.. ipython:: python

    df.groupby('label1').rolling(1).sum()
Some minimum supported versions of dependencies were updated. If installed, we now require:
Package | Minimum Version | Required | Changed |
---|---|---|---|
numpy | 1.17.3 | X | X |
pytz | 2017.3 | X | |
python-dateutil | 2.7.3 | X | |
bottleneck | 1.2.1 | | |
numexpr | 2.7.0 | X | |
pytest (dev) | 6.0 | X | |
mypy (dev) | 0.800 | X | |
setuptools | 38.6.0 | X | |
For optional libraries the general recommendation is to use the latest version. The following table lists the lowest version per library that is currently being tested throughout the development of pandas. Optional libraries below the lowest tested version may still work, but are not considered supported.
Package | Minimum Version | Changed |
---|---|---|
beautifulsoup4 | 4.6.0 | |
fastparquet | 0.4.0 | X |
fsspec | 0.7.4 | |
gcsfs | 0.6.0 | |
lxml | 4.3.0 | |
matplotlib | 2.2.3 | |
numba | 0.46.0 | |
openpyxl | 3.0.0 | X |
pyarrow | 0.17.0 | X |
pymysql | 0.8.1 | X |
pytables | 3.5.1 | |
s3fs | 0.4.0 | |
scipy | 1.2.0 | |
sqlalchemy | 1.3.0 | X |
tabulate | 0.8.7 | X |
xarray | 0.12.0 | |
xlrd | 1.2.0 | |
xlsxwriter | 1.0.2 | |
xlwt | 1.3.0 | |
pandas-gbq | 0.12.0 | |
See :ref:`install.dependencies` and :ref:`install.optional_dependencies` for more.
- Partially initialized :class:`CategoricalDtype` objects (i.e. those with ``categories=None``) will no longer compare as equal to fully initialized dtype objects.
- Accessing ``_constructor_expanddim`` on a :class:`DataFrame` and ``_constructor_sliced`` on a :class:`Series` now raise an ``AttributeError``. Previously a ``NotImplementedError`` was raised (:issue:`38782`)
- Added new ``engine`` and ``**engine_kwargs`` parameters to :meth:`DataFrame.to_sql` to support other future "SQL engines". Currently we still only use ``SQLAlchemy`` under the hood, but more engines are planned to be supported such as ``turbodbc`` (:issue:`36893`)
- Documentation in ``.pptx`` and ``.pdf`` formats is no longer included in wheels or source distributions (:issue:`30741`)
- Deprecated allowing scalars to be passed to the :class:`Categorical` constructor (:issue:`38433`)
- Deprecated constructing :class:`CategoricalIndex` without passing list-like data (:issue:`38944`)
- Deprecated allowing subclass-specific keyword arguments in the :class:`Index` constructor, use the specific subclass directly instead (:issue:`14093`, :issue:`21311`, :issue:`22315`, :issue:`26974`)
- Deprecated ``astype`` of datetimelike (``timedelta64[ns]``, ``datetime64[ns]``, ``Datetime64TZDtype``, ``PeriodDtype``) to integer dtypes, use ``values.view(...)`` instead (:issue:`38544`)
- Deprecated :meth:`MultiIndex.is_lexsorted` and :meth:`MultiIndex.lexsort_depth`, use :meth:`MultiIndex.is_monotonic_increasing` instead (:issue:`32259`)
- Deprecated keyword ``try_cast`` in :meth:`Series.where`, :meth:`Series.mask`, :meth:`DataFrame.where`, :meth:`DataFrame.mask`; cast results manually if desired (:issue:`38836`)
- Deprecated comparison of :class:`Timestamp` object with ``datetime.date`` objects. Instead of e.g. ``ts <= mydate`` use ``ts <= pd.Timestamp(mydate)`` or ``ts.date() <= mydate`` (:issue:`36131`)
- Deprecated :attr:`Rolling.win_type` returning ``"freq"`` (:issue:`38963`)
- Deprecated :attr:`Rolling.is_datetimelike` (:issue:`38963`)
- Deprecated :class:`DataFrame` indexer for :meth:`Series.__setitem__` and :meth:`DataFrame.__setitem__` (:issue:`39004`)
- Deprecated :meth:`core.window.ewm.ExponentialMovingWindow.vol` (:issue:`39220`)
- Using ``.astype`` to convert between ``datetime64[ns]`` dtype and :class:`DatetimeTZDtype` is deprecated and will raise in a future version, use ``obj.tz_localize`` or ``obj.dt.tz_localize`` instead (:issue:`38622`)
- Deprecated casting ``datetime.date`` objects to ``datetime64`` when used as ``fill_value`` in :meth:`DataFrame.unstack`, :meth:`DataFrame.shift`, :meth:`Series.shift`, and :meth:`DataFrame.reindex`, pass ``pd.Timestamp(dateobj)`` instead (:issue:`39767`)
- Deprecated :meth:`.Styler.set_na_rep` and :meth:`.Styler.set_precision` in favour of :meth:`.Styler.format` with ``na_rep`` and ``precision`` as existing and new input arguments respectively (:issue:`40134`, :issue:`40425`)
- Deprecated allowing partial failure in :meth:`Series.transform` and :meth:`DataFrame.transform` when ``func`` is list-like or dict-like and raises anything but ``TypeError``; ``func`` raising anything but a ``TypeError`` will raise in a future version (:issue:`40211`)
- Deprecated arguments ``error_bad_lines`` and ``warn_bad_lines`` in :meth:`read_csv` and :meth:`read_table` in favor of argument ``on_bad_lines`` (:issue:`15122`)
- Deprecated support for ``np.ma.mrecords.MaskedRecords`` in the :class:`DataFrame` constructor, pass ``{name: data[name] for name in data.dtype.names}`` instead (:issue:`40363`)
- Deprecated using :func:`merge` or :func:`join` on a different number of levels (:issue:`34862`)
- Deprecated the use of ``**kwargs`` in :class:`.ExcelWriter`; use the keyword argument ``engine_kwargs`` instead (:issue:`40430`)
- Deprecated the ``level`` keyword for :class:`DataFrame` and :class:`Series` aggregations; use groupby instead (:issue:`39983`)
- The ``inplace`` parameter of :meth:`Categorical.remove_categories`, :meth:`Categorical.add_categories`, :meth:`Categorical.reorder_categories`, :meth:`Categorical.rename_categories`, :meth:`Categorical.set_categories` is deprecated and will be removed in a future version (:issue:`37643`)
- Deprecated :func:`merge` producing duplicated columns through the ``suffixes`` keyword and already existing columns (:issue:`22818`)
- Deprecated setting :attr:`Categorical._codes`, create a new :class:`Categorical` with the desired codes instead (:issue:`40606`)
- Deprecated the ``convert_float`` optional argument in :func:`read_excel` and :meth:`ExcelFile.parse` (:issue:`41127`)
- Deprecated behavior of :meth:`DatetimeIndex.union` with mixed timezones; in a future version both will be cast to UTC instead of object dtype (:issue:`39328`)
- Deprecated using ``usecols`` with out of bounds indices for ``read_csv`` with ``engine="c"`` (:issue:`25623`)
- Deprecated passing arguments as positional (except for ``"codes"``) in :meth:`MultiIndex.set_codes` (:issue:`41485`)
- Deprecated passing arguments as positional in :meth:`Index.set_names` and :meth:`MultiIndex.set_names` (except for ``names``) (:issue:`41485`)
- Deprecated passing arguments (apart from ``cond`` and ``other``) as positional in :meth:`DataFrame.mask` and :meth:`Series.mask` (:issue:`41485`)
- Deprecated passing arguments as positional in :meth:`DataFrame.clip` and :meth:`Series.clip` (other than ``"upper"`` and ``"lower"``) (:issue:`41485`)
- Deprecated special treatment of lists with first element a Categorical in the :class:`DataFrame` constructor; pass as ``pd.DataFrame({col: categorical, ...})`` instead (:issue:`38845`)
- Deprecated passing arguments as positional (except for ``"method"``) in :meth:`DataFrame.interpolate` and :meth:`Series.interpolate` (:issue:`41485`)
- Deprecated passing arguments as positional in :meth:`DataFrame.ffill`, :meth:`Series.ffill`, :meth:`DataFrame.bfill`, and :meth:`Series.bfill` (:issue:`41485`)
- Deprecated passing arguments as positional in :meth:`DataFrame.sort_values` (other than ``"by"``) and :meth:`Series.sort_values` (:issue:`41485`)
- Deprecated passing arguments as positional in :meth:`DataFrame.dropna` and :meth:`Series.dropna` (:issue:`41485`)
- Deprecated passing arguments as positional in :meth:`DataFrame.set_index` (other than ``"keys"``) (:issue:`41485`)
- Deprecated passing arguments as positional (except for ``"levels"``) in :meth:`MultiIndex.set_levels` (:issue:`41485`)
- Deprecated passing arguments as positional in :meth:`DataFrame.sort_index` and :meth:`Series.sort_index` (:issue:`41485`)
- Deprecated passing arguments as positional in :meth:`DataFrame.drop_duplicates` (except for ``subset``), :meth:`Series.drop_duplicates`, :meth:`Index.drop_duplicates` and :meth:`MultiIndex.drop_duplicates` (:issue:`41485`)
- Deprecated passing arguments (apart from ``value``) as positional in :meth:`DataFrame.fillna` and :meth:`Series.fillna` (:issue:`41485`)
- Deprecated passing arguments as positional in :meth:`DataFrame.reset_index` (other than ``"level"``) and :meth:`Series.reset_index` (:issue:`41485`)
- Deprecated construction of :class:`Series` or :class:`DataFrame` with ``DatetimeTZDtype`` data and ``datetime64[ns]`` dtype. Use ``Series(data).dt.tz_localize(None)`` instead (:issue:`41555`, :issue:`33401`)
- Deprecated passing arguments as positional in :meth:`DataFrame.set_axis` and :meth:`Series.set_axis` (other than ``"labels"``) (:issue:`41485`)
- Deprecated passing arguments as positional in :meth:`DataFrame.where` and :meth:`Series.where` (other than ``"cond"`` and ``"other"``) (:issue:`41485`)
- Deprecated passing arguments as positional (other than ``filepath_or_buffer``) in :func:`read_csv` (:issue:`41485`)
- Deprecated passing arguments as positional in :meth:`DataFrame.drop` (other than ``"labels"``) and :meth:`Series.drop` (:issue:`41485`)
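Several of the deprecations above come with a suggested replacement. The following is a minimal sketch of what the replacements might look like in user code; the CSV text and the variable names are invented for illustration and are not taken from the release notes.

```python
import datetime as dt
import io

import pandas as pd

# Compare a Timestamp with a datetime.date by converting one side explicitly.
ts = pd.Timestamp("2021-07-02 12:00")
mydate = dt.date(2021, 7, 2)
later_or_equal = ts >= pd.Timestamp(mydate)
same_day = ts.date() == mydate

# Use on_bad_lines in place of error_bad_lines / warn_bad_lines.
# The third line of this made-up input has one field too many.
raw = "a,b\n1,2\n3,4,5\n6,7\n"
df = pd.read_csv(io.StringIO(raw), on_bad_lines="skip")

# Pass everything except the leading argument by keyword.
sorted_df = df.sort_values("a", ascending=False)
deduped = df.drop_duplicates(subset=["a"], keep="first")

print(later_or_equal, same_day)
print(sorted_df)
```

Spelling out keyword names also keeps calls readable once the positional forms stop being accepted.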
When calling a reduction (``.min``, ``.max``, ``.sum``, ...) on a :class:`DataFrame` with
``numeric_only=None`` (the default), columns on which the reduction raises ``TypeError``
are silently ignored and dropped from the result.

This behavior is deprecated. In a future version, the ``TypeError`` will be raised,
and users will need to select only valid columns before calling the function.

For example:
.. ipython:: python

    df = pd.DataFrame({"A": [1, 2, 3, 4], "B": pd.date_range("2016-01-01", periods=4)})
    df

Old behavior:

.. code-block:: ipython

    In [3]: df.prod()
    Out[3]:
    A    24
    dtype: int64

Future behavior:

.. code-block:: ipython

    In [4]: df.prod()
    ...
    TypeError: 'DatetimeArray' does not implement reduction 'prod'

    In [5]: df[["A"]].prod()
    Out[5]:
    A    24
    dtype: int64
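Instead of listing the valid columns by name, they can also be selected by dtype before reducing. This is a minimal sketch, assuming that the numeric columns are exactly the ones for which the reduction is defined; the use of ``select_dtypes`` is an illustration and not something the release notes prescribe.

```python
import pandas as pd

df = pd.DataFrame(
    {"A": [1, 2, 3, 4], "B": pd.date_range("2016-01-01", periods=4)}
)

# Keep only the numeric columns, then reduce; the datetime column "B"
# is excluded explicitly rather than being dropped silently.
result = df.select_dtypes(include="number").prod()
print(result)
```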
Similarly, when applying a function to :class:`DataFrameGroupBy`, columns on which
the function raises ``TypeError`` are currently silently ignored and dropped
from the result.

This behavior is deprecated. In a future version, the ``TypeError``
will be raised, and users will need to select only valid columns before calling
the function.

For example:

.. ipython:: python

    df = pd.DataFrame({"A": [1, 2, 3, 4], "B": pd.date_range("2016-01-01", periods=4)})
    gb = df.groupby([1, 1, 2, 2])

Old behavior:

.. code-block:: ipython

    In [4]: gb.prod(numeric_only=False)
    Out[4]:
        A
    1   2
    2  12

Future behavior:

.. code-block:: ipython

    In [5]: gb.prod(numeric_only=False)
    ...
    TypeError: datetime64 type does not support prod operations

    In [6]: gb[["A"]].prod(numeric_only=False)
    Out[6]:
        A
    1   2
    2  12
- Performance improvement in :meth:`IntervalIndex.isin` (:issue:`38353`)
- Performance improvement in :meth:`Series.mean` for nullable data types (:issue:`34814`)
- Performance improvement in :meth:`Series.isin` for nullable data types (:issue:`38340`)
- Performance improvement in :meth:`DataFrame.fillna` with ``method="pad|backfill"`` for nullable floating and nullable integer dtypes (:issue:`39953`)
- Performance improvement in :meth:`DataFrame.corr` for ``method=kendall`` (:issue:`28329`)
- Performance improvement in :meth:`core.window.rolling.Rolling.corr` and :meth:`core.window.rolling.Rolling.cov` (:issue:`39388`)
- Performance improvement in :meth:`core.window.rolling.RollingGroupby.corr`, :meth:`core.window.expanding.ExpandingGroupby.corr`, :meth:`core.window.expanding.ExpandingGroupby.corr` and :meth:`core.window.expanding.ExpandingGroupby.cov` (:issue:`39591`)
- Performance improvement in :func:`unique` for object data type (:issue:`37615`)
- Performance improvement in :func:`pd.json_normalize` for basic cases (including separators) (:issue:`40035` :issue:`15621`)
- Performance improvement in :class:`core.window.rolling.ExpandingGroupby` aggregation methods (:issue:`39664`)
- Performance improvement in :class:`Styler` where render times are more than 50% reduced (:issue:`39972` :issue:`39952`)
- Performance improvement in :meth:`core.window.ewm.ExponentialMovingWindow.mean` with ``times`` (:issue:`39784`)
- Performance improvement in :meth:`.GroupBy.apply` when requiring the python fallback implementation (:issue:`40176`)
- Performance improvement in the conversion of pyarrow boolean array to a pandas nullable boolean array (:issue:`41051`)
- Performance improvement for concatenation of data with type :class:`CategoricalDtype` (:issue:`40193`)
- Performance improvement in :meth:`.GroupBy.cummin` and :meth:`.GroupBy.cummax` with nullable data types (:issue:`37493`)
- Performance improvement in :meth:`Series.nunique` with nan values (:issue:`40865`)
- Performance improvement in :meth:`DataFrame.transpose`, :meth:`Series.unstack` with ``DatetimeTZDtype`` (:issue:`40149`)
- Bug in :class:`CategoricalIndex` incorrectly failing to raise ``TypeError`` when scalar data is passed (:issue:`38614`)
- Bug in ``CategoricalIndex.reindex`` failing when an ``Index`` was passed with elements all in category (:issue:`28690`)
- Bug where constructing a :class:`Categorical` from an object-dtype array of ``date`` objects did not round-trip correctly with ``astype`` (:issue:`38552`)
- Bug in constructing a :class:`DataFrame` from an ``ndarray`` and a :class:`CategoricalDtype` (:issue:`38857`)
- Bug in setting categorical values into an object-dtype column in a :class:`DataFrame` (:issue:`39136`)
- Bug in :meth:`DataFrame.reindex` was raising ``IndexError`` when new index contained duplicates and old index was :class:`CategoricalIndex` (:issue:`38906`)
- Bug in :class:`DataFrame` and :class:`Series` constructors sometimes dropping nanoseconds from :class:`Timestamp` (resp. :class:`Timedelta`) ``data``, with ``dtype=datetime64[ns]`` (resp. ``timedelta64[ns]``) (:issue:`38032`)
- Bug in :meth:`DataFrame.first` and :meth:`Series.first` returning two months for offset one month when first day is last calendar day (:issue:`29623`)
- Bug in constructing a :class:`DataFrame` or :class:`Series` with mismatched ``datetime64`` data and ``timedelta64`` dtype, or vice-versa, failing to raise ``TypeError`` (:issue:`38575`, :issue:`38764`, :issue:`38792`)
- Bug in constructing a :class:`Series` or :class:`DataFrame` with a ``datetime`` object out of bounds for ``datetime64[ns]`` dtype or a ``timedelta`` object out of bounds for ``timedelta64[ns]`` dtype (:issue:`38792`, :issue:`38965`)
- Bug in :meth:`DatetimeIndex.intersection`, :meth:`DatetimeIndex.symmetric_difference`, :meth:`PeriodIndex.intersection`, :meth:`PeriodIndex.symmetric_difference` always returning object-dtype when operating with :class:`CategoricalIndex` (:issue:`38741`)
- Bug in :meth:`Series.where` incorrectly casting ``datetime64`` values to ``int64`` (:issue:`37682`)
- Bug in :class:`Categorical` incorrectly typecasting ``datetime`` object to ``Timestamp`` (:issue:`38878`)
- Bug in comparisons between :class:`Timestamp` object and ``datetime64`` objects just outside the implementation bounds for nanosecond ``datetime64`` (:issue:`39221`)
- Bug in :meth:`Timestamp.round`, :meth:`Timestamp.floor`, :meth:`Timestamp.ceil` for values near the implementation bounds of :class:`Timestamp` (:issue:`39244`)
- Bug in :meth:`Timedelta.round`, :meth:`Timedelta.floor`, :meth:`Timedelta.ceil` for values near the implementation bounds of :class:`Timedelta` (:issue:`38964`)
- Bug in :func:`date_range` incorrectly creating :class:`DatetimeIndex` containing ``NaT`` instead of raising ``OutOfBoundsDatetime`` in corner cases (:issue:`24124`)
- Bug in :func:`infer_freq` incorrectly fails to infer 'H' frequency of :class:`DatetimeIndex` if the latter has a timezone and crosses DST boundaries (:issue:`39556`)
- Bug in constructing :class:`Timedelta` from
np.timedelta64
objects with non-nanosecond units that are out of bounds fortimedelta64[ns]
(:issue:`38965`) - Bug in constructing a :class:`TimedeltaIndex` incorrectly accepting
np.datetime64("NaT")
objects (:issue:`39462`) - Bug in constructing :class:`Timedelta` from input string with only symbols and no digits failed to raise an error (:issue:`39710`)
- Bug in :class:`TimedeltaIndex` and :func:`to_timedelta` failing to raise when passed non-nanosecond
timedelta64
arrays that overflow when converting totimedelta64[ns]
(:issue:`40008`)
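The symbols-only string fix above (:issue:`39710`) can be illustrated with a minimal sketch. It assumes pandas 1.3.0 or later is installed; ``rejects_symbols_only`` is a hypothetical helper written for this illustration, not a pandas function.

```python
import pandas as pd


def rejects_symbols_only(text):
    """Return True if pd.Timedelta refuses to parse ``text`` (hypothetical helper)."""
    try:
        pd.Timedelta(text)
    except ValueError:
        return True
    return False


# A sign with no digits carries no duration, so construction is expected to fail.
symbol_only_rejected = rejects_symbols_only("+")

# A well-formed string still parses as before.
one_day = pd.Timedelta("1 day")
```

Wrapping the call in a helper keeps the check reusable for other malformed inputs.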
- Bug in different ``tzinfo`` objects representing UTC not being treated as equivalent (:issue:`39216`)
- Bug in ``dateutil.tz.gettz("UTC")`` not being recognized as equivalent to other UTC-representing tzinfos (:issue:`39276`)
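As a rough illustration of the intended equivalence between UTC representations, the sketch below builds the same instant with three different ``tzinfo`` objects. It assumes pandas 1.3.0 or later and ``dateutil`` (a pandas dependency). It shows the expected outcome rather than the exact code paths touched by the fixes.

```python
from datetime import timezone

import pandas as pd
from dateutil import tz

# Three different tzinfo objects that all represent UTC.
stamp_stdlib = pd.Timestamp("2021-01-01 12:00", tz=timezone.utc)
stamp_string = pd.Timestamp("2021-01-01 12:00", tz="UTC")
stamp_dateutil = pd.Timestamp("2021-01-01 12:00", tz=tz.tzutc())

# The same instant compares equal regardless of which UTC object is attached.
all_equal = stamp_stdlib == stamp_string == stamp_dateutil

# Indexes built with different UTC objects are expected to be considered equal too.
index_stdlib = pd.DatetimeIndex(["2021-01-01 12:00"], tz=timezone.utc)
index_string = pd.DatetimeIndex(["2021-01-01 12:00"], tz="UTC")
indexes_equal = index_stdlib.equals(index_string)
```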
- Bug in :meth:`DataFrame.quantile`, :meth:`DataFrame.sort_values` causing incorrect subsequent indexing behavior (:issue:`38351`)
- Bug in :meth:`DataFrame.sort_values` raising an :class:`IndexError` for empty ``by`` (:issue:`40258`)
- Bug in :meth:`DataFrame.select_dtypes` with ``include=np.number`` not retaining numeric ``ExtensionDtype`` columns; these are now retained (:issue:`35340`)
- Bug in :meth:`DataFrame.mode` and :meth:`Series.mode` not keeping consistent integer :class:`Index` for empty input (:issue:`33321`)
- Bug in :meth:`DataFrame.rank` with ``np.inf`` and mixture of ``np.nan`` and ``np.inf`` (:issue:`32593`)
- Bug in :meth:`DataFrame.rank` with ``axis=0`` and columns holding incomparable types raising ``IndexError`` (:issue:`38932`)
- Bug in ``rank`` method for :class:`Series`, :class:`DataFrame`, :class:`DataFrameGroupBy`, and :class:`SeriesGroupBy` treating the most negative ``int64`` value as missing (:issue:`32859`)
- Bug in :func:`select_dtypes` behaving differently between Windows and Linux with ``include="int"`` (:issue:`36569`)
- Bug in :meth:`DataFrame.apply` and :meth:`DataFrame.agg` when passed argument ``func="size"`` would operate on the entire ``DataFrame`` instead of rows or columns (:issue:`39934`)
- Bug in :meth:`DataFrame.transform` would raise ``SpecificationError`` when passed a dictionary and columns were missing; will now raise a ``KeyError`` instead (:issue:`40004`)
- Bug in :meth:`DataFrameGroupBy.rank` giving incorrect results with ``pct=True`` and equal values between consecutive groups (:issue:`40518`)
- Bug in :meth:`Series.count` would result in an ``int32`` result on 32-bit platforms when argument ``level=None`` (:issue:`40908`)
- Bug in :class:`Series` and :class:`DataFrame` reductions with methods ``any`` and ``all`` not returning boolean results for object data (:issue:`12863`, :issue:`35450`, :issue:`27709`)
- Bug in :meth:`Series.clip` would fail if series contains NA values and has nullable int or float as a data type (:issue:`40851`)
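The ``rank`` fix for the most negative ``int64`` value (:issue:`32859`) can be sketched as follows, assuming pandas 1.3.0 or later. The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

smallest = np.iinfo(np.int64).min
values = pd.Series([smallest, 0, 1], dtype="int64")

# The smallest int64 is an ordinary value here, so it should receive rank 1
# rather than being skipped as if it were missing.
ranks = values.rank()
```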
- Bug in :meth:`Series.to_dict` with ``orient='records'`` not returning Python native types (:issue:`25969`)
- Bug in :meth:`Series.view` and :meth:`Index.view` when converting between datetime-like (``datetime64[ns]``, ``datetime64[ns, tz]``, ``timedelta64``, ``period``) dtypes (:issue:`39788`)
- Bug in creating a :class:`DataFrame` from an empty ``np.recarray`` not retaining the original dtypes (:issue:`40121`)
- Bug in :class:`DataFrame` failing to raise ``TypeError`` when constructing from a ``frozenset`` (:issue:`40163`)
- Bug in :class:`Index` construction silently ignoring a passed ``dtype`` when the data cannot be cast to that dtype (:issue:`21311`)
- Bug in :meth:`StringArray.astype` falling back to numpy and raising when converting to ``dtype='categorical'`` (:issue:`40450`)
- Bug in :func:`factorize` where, when given an array with a numeric numpy dtype lower than int64, uint64 and float64, the unique values did not keep their original dtype (:issue:`41132`)
- Bug in :class:`DataFrame` construction with a dictionary containing an arraylike with ``ExtensionDtype`` and ``copy=True`` failing to make a copy (:issue:`38939`)
- Bug in :meth:`qcut` raising error when taking ``Float64DType`` as input (:issue:`40730`)
- Bug in :class:`DataFrame` and :class:`Series` construction with ``datetime64[ns]`` data and ``dtype=object`` resulting in ``datetime`` objects instead of :class:`Timestamp` objects (:issue:`41599`)
- Bug in :class:`DataFrame` and :class:`Series` construction with ``timedelta64[ns]`` data and ``dtype=object`` resulting in ``np.timedelta64`` objects instead of :class:`Timedelta` objects (:issue:`41599`)
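Two of the conversion fixes above can be sketched briefly, assuming pandas 1.3.0 or later: object-dtype construction from ``datetime64[ns]`` data (:issue:`41599`) and :func:`factorize` keeping a narrow numeric dtype (:issue:`41132`). The sample values are made up for illustration.

```python
import numpy as np
import pandas as pd

# datetime64[ns] data requested as object dtype should hold Timestamp objects.
stamps = np.array(["2021-01-01", "2021-01-02"], dtype="datetime64[ns]")
as_objects = pd.Series(stamps, dtype=object)
first = as_objects.iloc[0]

# factorize should keep the narrow int16 dtype for the unique values.
codes, uniques = pd.factorize(np.array([3, 1, 3], dtype="int16"))
```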
- Bug in the conversion from ``pyarrow.ChunkedArray`` to :class:`~arrays.StringArray` when the original had zero chunks (:issue:`41040`)
- Bug in :meth:`Series.replace` and :meth:`DataFrame.replace` ignoring replacements with ``regex=True`` for ``StringDType`` data (:issue:`41333`, :issue:`35977`)
- Bug in :meth:`Series.str.extract` with :class:`~arrays.StringArray` returning object dtype for empty :class:`DataFrame` (:issue:`41441`)
- Bug in :meth:`Series.str.replace` where the ``case`` argument was ignored when ``regex=False`` (:issue:`41602`)
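A minimal sketch of the ``case`` handling in :meth:`Series.str.replace` with ``regex=False`` (:issue:`41602`), assuming pandas 1.3.0 or later. The sample strings are invented for the example.

```python
import pandas as pd

names = pd.Series(["Foo.bar", "foo.baz"])

# With regex=False the pattern is taken literally (the dot is a plain dot),
# and case=False should still make the match case-insensitive.
replaced = names.str.replace("FOO.", "x-", case=False, regex=False)
```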
- Bug in :meth:`IntervalIndex.intersection` and :meth:`IntervalIndex.symmetric_difference` always returning object-dtype when operating with :class:`CategoricalIndex` (:issue:`38653`, :issue:`38741`)
- Bug in :meth:`IntervalIndex.intersection` returning duplicates when at least one of both Indexes has duplicates which are present in the other (:issue:`38743`)
- :meth:`IntervalIndex.union`, :meth:`IntervalIndex.intersection`, :meth:`IntervalIndex.difference`, and :meth:`IntervalIndex.symmetric_difference` now cast to the appropriate dtype instead of raising ``TypeError`` when operating with another :class:`IntervalIndex` with incompatible dtype (:issue:`39267`)
- :meth:`PeriodIndex.union`, :meth:`PeriodIndex.intersection`, :meth:`PeriodIndex.symmetric_difference`, :meth:`PeriodIndex.difference` now cast to object dtype instead of raising ``IncompatibleFrequency`` when operating with another :class:`PeriodIndex` with incompatible dtype (:issue:`??`)
- Bug in :meth:`Index.union` and :meth:`MultiIndex.union` dropping duplicate ``Index`` values when ``Index`` was not monotonic or ``sort`` was set to ``False`` (:issue:`36289`, :issue:`31326`, :issue:`40862`)
- Bug in :meth:`CategoricalIndex.get_indexer` failing to raise ``InvalidIndexError`` when non-unique (:issue:`38372`)
- Bug in :meth:`Series.loc` raising ``ValueError`` when input was filtered with a boolean list and values to set were a list with lower dimension (:issue:`20438`)
- Bug in inserting many new columns into a :class:`DataFrame` causing incorrect subsequent indexing behavior (:issue:`38380`)
- Bug in :meth:`DataFrame.__setitem__` raising ``ValueError`` when setting multiple values to duplicate columns (:issue:`15695`)
- Bug in :meth:`DataFrame.loc`, :meth:`Series.loc`, :meth:`DataFrame.__getitem__` and :meth:`Series.__getitem__` returning incorrect elements for non-monotonic :class:`DatetimeIndex` for string slices (:issue:`33146`)
- Bug in :meth:`DataFrame.reindex` and :meth:`Series.reindex` with timezone aware indexes raising ``TypeError`` for ``method="ffill"`` and ``method="bfill"`` and specified ``tolerance`` (:issue:`38566`)
- Bug in :meth:`DataFrame.reindex` with ``datetime64[ns]`` or ``timedelta64[ns]`` incorrectly casting to integers when the ``fill_value`` requires casting to object dtype (:issue:`39755`)
- Bug in :meth:`DataFrame.__setitem__` raising ``ValueError`` with empty :class:`DataFrame` and specified columns for string indexer and non empty :class:`DataFrame` to set (:issue:`38831`)
- Bug in :meth:`DataFrame.loc.__setitem__` raising ``ValueError`` when expanding unique column for :class:`DataFrame` with duplicate columns (:issue:`38521`)
- Bug in :meth:`DataFrame.iloc.__setitem__` and :meth:`DataFrame.loc.__setitem__` with mixed dtypes when setting with a dictionary value (:issue:`38335`)
- Bug in :meth:`Series.loc.__setitem__` and :meth:`DataFrame.loc.__setitem__` raising ``KeyError`` for boolean Iterator indexer (:issue:`39614`)
- Bug in :meth:`Series.iloc` and :meth:`DataFrame.iloc` raising ``KeyError`` for Iterator indexer (:issue:`39614`)
- Bug in :meth:`DataFrame.__setitem__` not raising ``ValueError`` when right hand side is a :class:`DataFrame` with wrong number of columns (:issue:`38604`)
- Bug in :meth:`Series.__setitem__` raising ``ValueError`` when setting a :class:`Series` with a scalar indexer (:issue:`38303`)
- Bug in :meth:`DataFrame.loc` dropping levels of :class:`MultiIndex` when :class:`DataFrame` used as input has only one row (:issue:`10521`)
- Bug in :meth:`DataFrame.__getitem__` and :meth:`Series.__getitem__` always raising ``KeyError`` when slicing with existing strings where the :class:`Index` has milliseconds (:issue:`33589`)
- Bug in setting ``timedelta64`` or ``datetime64`` values into numeric :class:`Series` failing to cast to object dtype (:issue:`39086`, :issue:`39619`)
- Bug in setting :class:`Interval` values into a :class:`Series` or :class:`DataFrame` with mismatched :class:`IntervalDtype` incorrectly casting the new values to the existing dtype (:issue:`39120`)
- Bug in setting ``datetime64`` values into a :class:`Series` with integer-dtype incorrectly casting the datetime64 values to integers (:issue:`39266`)
- Bug in setting ``np.datetime64("NaT")`` into a :class:`Series` with :class:`Datetime64TZDtype` incorrectly treating the timezone-naive value as timezone-aware (:issue:`39769`)
- Bug in :meth:`Index.get_loc` not raising ``KeyError`` when method is specified for ``NaN`` value when ``NaN`` is not in :class:`Index` (:issue:`39382`)
- Bug in :meth:`DatetimeIndex.insert` when inserting ``np.datetime64("NaT")`` into a timezone-aware index incorrectly treating the timezone-naive value as timezone-aware (:issue:`39769`)
- Bug in incorrectly raising in :meth:`Index.insert`, when setting a new column that cannot be held in the existing ``frame.columns``, or in :meth:`Series.reset_index` or :meth:`DataFrame.reset_index` instead of casting to a compatible dtype (:issue:`39068`)
- Bug in :meth:`RangeIndex.append` where a single object of length 1 was concatenated incorrectly (:issue:`39401`)
- Bug in :meth:`RangeIndex.astype` where when converting to :class:`CategoricalIndex`, the categories became a :class:`Int64Index` instead of a :class:`RangeIndex` (:issue:`41263`)
- Bug in setting ``numpy.timedelta64`` values into an object-dtype :class:`Series` using a boolean indexer (:issue:`39488`)
- Bug in setting numeric values into a boolean-dtype :class:`Series` using ``at`` or ``iat`` failing to cast to object-dtype (:issue:`39582`)
- Bug in :meth:`DataFrame.__setitem__` and :meth:`DataFrame.iloc.__setitem__` raising ``ValueError`` when trying to index with a row-slice and setting a list as values (:issue:`40440`)
- Bug in :meth:`DataFrame.loc` not raising ``KeyError`` when key was not found in :class:`MultiIndex` when levels contain more values than used (:issue:`41170`)
- Bug in :meth:`DataFrame.loc.__setitem__` when setting-with-expansion incorrectly raising when the index in the expanding axis contains duplicates (:issue:`40096`)
- Bug in :meth:`DataFrame.loc.__getitem__` with :class:`MultiIndex` casting to float when at least one column has float dtype and we retrieve a scalar (:issue:`41369`)
- Bug in :meth:`DataFrame.loc` incorrectly matching non-boolean index elements (:issue:`20432`)
- Bug in :meth:`Series.__delitem__` with ``ExtensionDtype`` incorrectly casting to ``ndarray`` (:issue:`40386`)
- Bug in :meth:`DataFrame.loc` returning :class:`MultiIndex` in wrong order if indexer has duplicates (:issue:`40978`)
- Bug in :meth:`DataFrame.__setitem__` raising ``TypeError`` when using a str subclass as the column name with a :class:`DatetimeIndex` (:issue:`37366`)
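The column-count check in :meth:`DataFrame.__setitem__` (:issue:`38604`) can be sketched as follows, assuming pandas 1.3.0 or later. The frames and their column names are made up for illustration.

```python
import pandas as pd

frame = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
too_wide = pd.DataFrame({"x": [5, 6], "y": [7, 8], "z": [9, 10]})

try:
    # Two target columns but three source columns: this should be rejected.
    frame[["a", "b"]] = too_wide
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

Capturing the message instead of letting the exception propagate makes it easy to see that the mismatch is reported rather than silently accepted.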
- Bug in :class:`Grouper` not correctly propagating the ``dropna`` argument; :meth:`DataFrameGroupBy.transform` now correctly handles missing values for ``dropna=True`` (:issue:`35612`)
- Bug in :func:`isna`, and :meth:`Series.isna`, :meth:`Index.isna`, :meth:`DataFrame.isna` (and the corresponding ``notna`` functions) not recognizing ``Decimal("NaN")`` objects (:issue:`39409`)
- Bug in :meth:`DataFrame.fillna` not accepting dictionary for ``downcast`` keyword (:issue:`40809`)
- Bug in :func:`isna` not returning a copy of the mask for nullable types, causing any subsequent mask modification to change the original array (:issue:`40935`)
- Bug in :class:`DataFrame` construction with float data containing ``NaN`` and an integer ``dtype`` casting instead of retaining the ``NaN`` (:issue:`26919`)
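The ``Decimal("NaN")`` recognition (:issue:`39409`) can be sketched with the standard-library ``decimal`` module, assuming pandas 1.3.0 or later:

```python
from decimal import Decimal

import pandas as pd

decimal_nan = Decimal("NaN")

# The scalar check and the Series method should both treat Decimal("NaN") as missing.
scalar_is_missing = pd.isna(decimal_nan)
series_mask = pd.Series([Decimal("1.5"), decimal_nan], dtype=object).isna()
```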
- Bug in :meth:`DataFrame.drop` raising ``TypeError`` when :class:`MultiIndex` is non-unique and ``level`` is not provided (:issue:`36293`)
- Bug in :meth:`MultiIndex.intersection` duplicating ``NaN`` in result (:issue:`38623`)
- Bug in :meth:`MultiIndex.equals` incorrectly returning ``True`` when the :class:`MultiIndex` contained ``NaN``, even when they are differently ordered (:issue:`38439`)
- Bug in :meth:`MultiIndex.intersection` always returning empty when intersecting with :class:`CategoricalIndex` (:issue:`38653`)
- Bug in :meth:`MultiIndex.reindex` raising ``ValueError`` with empty MultiIndex and indexing only a specific level (:issue:`41170`)
- Bug in :meth:`Index.__repr__` when ``display.max_seq_items=1`` (:issue:`38415`)
- Bug in :func:`read_csv` not recognizing scientific notation if decimal is set for ``engine="python"`` (:issue:`31920`)
- Bug in :func:`read_csv` interpreting ``NA`` value as comment, when ``NA`` does contain the comment string, fixed for ``engine="python"`` (:issue:`34002`)
- Bug in :func:`read_csv` raising ``IndexError`` with multiple header columns and ``index_col`` specified when file has no data rows (:issue:`38292`)
- Bug in :func:`read_csv` not accepting ``usecols`` with different length than ``names`` for ``engine="python"`` (:issue:`16469`)
- Bug in :meth:`read_csv` returning object dtype when ``delimiter=","`` with ``usecols`` and ``parse_dates`` specified for ``engine="python"`` (:issue:`35873`)
- Bug in :func:`read_csv` raising ``TypeError`` when ``names`` and ``parse_dates`` are specified for ``engine="c"`` (:issue:`33699`)
- Bug in :func:`read_clipboard`, :func:`DataFrame.to_clipboard` not working in WSL (:issue:`38527`)
- Allow custom error values for parse_dates argument of :func:`read_sql`, :func:`read_sql_query` and :func:`read_sql_table` (:issue:`35185`)
- Bug in :func:`to_hdf` raising ``KeyError`` when trying to apply for subclasses of ``DataFrame`` or ``Series`` (:issue:`33748`)
- Bug in :meth:`~HDFStore.put` raising a wrong ``TypeError`` when saving a DataFrame with non-string dtype (:issue:`34274`)
- Bug in :func:`json_normalize` resulting in the first element of a generator object not being included in the returned ``DataFrame`` (:issue:`35923`)
- Bug in :func:`read_csv` applying thousands separator to date columns when column should be parsed for dates and ``usecols`` is specified for ``engine="python"`` (:issue:`39365`)
- Bug in :func:`read_excel` forward filling :class:`MultiIndex` names with multiple header and index columns specified (:issue:`34673`)
- :func:`read_excel` now respects :func:`set_option` (:issue:`34252`)
- Bug in :func:`read_csv` not switching ``true_values`` and ``false_values`` for nullable ``boolean`` dtype (:issue:`34655`)
- Bug in :func:`read_json` when ``orient="split"`` not maintaining a numeric string index (:issue:`28556`)
- :meth:`read_sql` returned an empty generator if ``chunksize`` was non-zero and the query returned no results. Now returns a generator with a single empty dataframe (:issue:`34411`)
- Bug in :func:`read_hdf` returning unexpected records when filtering on categorical string columns using ``where`` parameter (:issue:`39189`)
- Bug in :func:`read_sas` raising ``ValueError`` when ``datetimes`` were null (:issue:`39725`)
- Bug in :func:`read_excel` dropping empty values from single-column spreadsheets (:issue:`39808`)
- Bug in :func:`read_excel` loading trailing empty rows/columns for some filetypes (:issue:`41167`)
- Bug in :func:`read_excel` raising ``AttributeError`` with ``MultiIndex`` header followed by two empty rows and no index, and bug affecting :func:`read_excel`, :func:`read_csv`, :func:`read_table`, :func:`read_fwf`, and :func:`read_clipboard` where one blank row after a ``MultiIndex`` header with no index would be dropped (:issue:`40442`)
- Bug in :meth:`DataFrame.to_string` misplacing the truncation column when ``index=False`` (:issue:`40904`)
- Bug in :meth:`DataFrame.to_string` adding an extra dot and misaligning the truncation row when ``index=False`` (:issue:`40904`)
- Bug in :func:`read_orc` always raising ``AttributeError`` (:issue:`40918`)
- Bug in :func:`read_csv` and :func:`read_table` silently ignoring ``prefix`` if ``names`` and ``prefix`` are defined, now raising ``ValueError`` (:issue:`39123`)
- Bug in :func:`read_csv` and :func:`read_excel` not respecting dtype for duplicated column name when ``mangle_dupe_cols`` is set to ``True`` (:issue:`35211`)
- Bug in :func:`read_csv` silently ignoring ``sep`` if ``delimiter`` and ``sep`` are defined, now raising ``ValueError`` (:issue:`39823`)
- Bug in :func:`read_csv` and :func:`read_table` misinterpreting arguments when ``sys.setprofile`` had been previously called (:issue:`41069`)
- Bug in the conversion from pyarrow to pandas (e.g. for reading Parquet) with nullable dtypes and a pyarrow array whose data buffer size is not a multiple of dtype size (:issue:`40896`)
- Bug in :func:`read_excel` would raise an error when pandas could not determine the file type, even when user specified the ``engine`` argument (:issue:`41225`)
- Bug in :func:`read_clipboard` copying from an excel file shifting values into the wrong column if there are null values in first column (:issue:`41108`)
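The ``sep``/``delimiter`` conflict in :func:`read_csv` (:issue:`39823`) can be illustrated with an in-memory file. This is a minimal sketch assuming pandas 1.3.0 or later. The CSV text is invented for the example.

```python
from io import StringIO

import pandas as pd

csv_text = "a;b\n1;2\n"

try:
    # Supplying both spellings of the separator is ambiguous and should raise.
    pd.read_csv(StringIO(csv_text), sep=";", delimiter=",")
    conflict_error = None
except ValueError as exc:
    conflict_error = str(exc)

# Passing only one of the two spellings works as usual.
parsed = pd.read_csv(StringIO(csv_text), sep=";")
```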
- Comparisons of :class:`Period` objects or :class:`Index`, :class:`Series`, or :class:`DataFrame` with mismatched ``PeriodDtype`` now behave like other mismatched-type comparisons, returning ``False`` for equals, ``True`` for not-equal, and raising ``TypeError`` for inequality checks (:issue:`39274`)
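A minimal sketch of the equality behaviour for mismatched frequencies (:issue:`39274`), assuming pandas 1.3.0 or later. Only ``==`` and ``!=`` are exercised here. The inequality case is left out of the sketch.

```python
import pandas as pd

monthly = pd.Period("2021-01", freq="M")
daily = pd.Period("2021-01-01", freq="D")

# Mismatched frequencies compare as "not equal" instead of raising.
is_equal = monthly == daily
is_not_equal = monthly != daily
```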
- Bug in :func:`scatter_matrix` raising when 2d ``ax`` argument passed (:issue:`16253`)
- Prevent warnings when matplotlib's ``constrained_layout`` is enabled (:issue:`25261`)
- Bug in :func:`DataFrame.plot` was showing the wrong colors in the legend if the function was called repeatedly and some calls used ``yerr`` while others didn't (partial fix of :issue:`39522`)
- Bug in :func:`DataFrame.plot` was showing the wrong colors in the legend if the function was called repeatedly and some calls used ``secondary_y`` and others use ``legend=False`` (:issue:`40044`)
- Bug in :meth:`DataFrame.plot.box` when the ``dark_background`` theme was selected, where caps or min/max markers for the plot were not visible (:issue:`40769`)
- Bug in :meth:`DataFrameGroupBy.agg` and :meth:`SeriesGroupBy.agg` with :class:`PeriodDtype` columns incorrectly casting results too aggressively (:issue:`38254`)
- Bug in :meth:`SeriesGroupBy.value_counts` where unobserved categories in a grouped categorical series were not tallied (:issue:`38672`)
- Bug in :meth:`SeriesGroupBy.value_counts` where error was raised on an empty series (:issue:`39172`)
- Bug in :meth:`.GroupBy.indices` would contain non-existent indices when null values were present in the groupby keys (:issue:`9304`)
- Fixed bug in :meth:`DataFrameGroupBy.sum` and :meth:`SeriesGroupBy.sum` causing loss of precision through using Kahan summation (:issue:`38778`)
- Fixed bug in :meth:`DataFrameGroupBy.cumsum`, :meth:`SeriesGroupBy.cumsum`, :meth:`DataFrameGroupBy.mean` and :meth:`SeriesGroupBy.mean` causing loss of precision through using Kahan summation (:issue:`38934`)
- Bug in :meth:`.Resampler.aggregate` and :meth:`DataFrame.transform` raising ``TypeError`` instead of ``SpecificationError`` when missing keys had mixed dtypes (:issue:`39025`)
- Bug in :meth:`.DataFrameGroupBy.idxmin` and :meth:`.DataFrameGroupBy.idxmax` with ``ExtensionDtype`` columns (:issue:`38733`)
- Bug in :meth:`Series.resample` would raise when the index was a :class:`PeriodIndex` consisting of ``NaT`` (:issue:`39227`)
- Bug in :meth:`core.window.rolling.RollingGroupby.corr` and :meth:`core.window.expanding.ExpandingGroupby.corr` where the groupby column would return 0 instead of ``np.nan`` when providing ``other`` that was longer than each group (:issue:`39591`)
- Bug in :meth:`core.window.expanding.ExpandingGroupby.corr` and :meth:`core.window.expanding.ExpandingGroupby.cov` where 1 would be returned instead of ``np.nan`` when providing ``other`` that was longer than each group (:issue:`39591`)
- Bug in :meth:`.GroupBy.mean`, :meth:`.GroupBy.median` and :meth:`DataFrame.pivot_table` not propagating metadata (:issue:`28283`)
- Bug in :meth:`Series.rolling` and :meth:`DataFrame.rolling` not calculating window bounds correctly when window is an offset and dates are in descending order (:issue:`40002`)
- Bug in :class:`SeriesGroupBy` and :class:`DataFrameGroupBy` on an empty ``Series`` or ``DataFrame`` would lose index, columns, and/or data types when directly using the methods ``idxmax``, ``idxmin``, ``mad``, ``min``, ``max``, ``sum``, ``prod``, and ``skew`` or using them through ``apply``, ``aggregate``, or ``resample`` (:issue:`26411`)
- Bug in :meth:`DataFrameGroupBy.apply` where a :class:`MultiIndex` would be created instead of an :class:`Index` if a :class:`core.window.rolling.RollingGroupby` object was created (:issue:`39732`)
- Bug in :meth:`DataFrameGroupBy.sample` where error was raised when ``weights`` was specified and the index was an :class:`Int64Index` (:issue:`39927`)
- Bug in :meth:`DataFrameGroupBy.aggregate` and :meth:`.Resampler.aggregate` would sometimes raise ``SpecificationError`` when passed a dictionary and columns were missing; will now always raise a ``KeyError`` instead (:issue:`40004`)
- Bug in :meth:`DataFrameGroupBy.sample` where column selection was not applied to sample result (:issue:`39928`)
- Bug in :class:`core.window.ewm.ExponentialMovingWindow` when calling ``__getitem__`` would incorrectly raise a ``ValueError`` when providing ``times`` (:issue:`40164`)
- Bug in :class:`core.window.ewm.ExponentialMovingWindow` when calling ``__getitem__`` would not retain ``com``, ``span``, ``alpha`` or ``halflife`` attributes (:issue:`40164`)
- :class:`core.window.ewm.ExponentialMovingWindow` now raises a ``NotImplementedError`` when specifying ``times`` with ``adjust=False`` due to an incorrect calculation (:issue:`40098`)
- Bug in :meth:`core.window.ewm.ExponentialMovingWindowGroupby.mean` where the times argument was ignored when ``engine='numba'`` (:issue:`40951`)
- Bug in :meth:`core.window.ewm.ExponentialMovingWindowGroupby.mean` where the wrong times were used in case of multiple groups (:issue:`40951`)
- Bug in :class:`core.window.ewm.ExponentialMovingWindowGroupby` where the times vector and values became out of sync for non-trivial groups (:issue:`40951`)
- Bug in :meth:`Series.asfreq` and :meth:`DataFrame.asfreq` dropping rows when the index is not sorted (:issue:`39805`)
- Bug in aggregation functions for :class:`DataFrame` not respecting ``numeric_only`` argument when ``level`` keyword was given (:issue:`40660`)
- Bug in :meth:`SeriesGroupBy.aggregate` where using a user-defined function to aggregate a ``Series`` with an object-typed :class:`Index` causes an incorrect :class:`Index` shape (:issue:`40014`)
- Bug in :class:`core.window.RollingGroupby` where ``as_index=False`` argument in ``groupby`` was ignored (:issue:`39433`)
- Bug in :meth:`.GroupBy.any` and :meth:`.GroupBy.all` raising ``ValueError`` when using with nullable type columns holding ``NA`` even with ``skipna=True`` (:issue:`40585`)
- Bug in :meth:`GroupBy.cummin` and :meth:`GroupBy.cummax` incorrectly rounding integer values near the ``int64`` implementation bounds (:issue:`40767`)
- Bug in :meth:`.GroupBy.rank` with nullable dtypes incorrectly raising ``TypeError`` (:issue:`41010`)
- Bug in :meth:`.GroupBy.cummin` and :meth:`.GroupBy.cummax` computing wrong result with nullable data types too large to roundtrip when casting to float (:issue:`37493`)
- Bug in :meth:`DataFrame.rolling` returning mean zero for all ``NaN`` window with ``min_periods=0`` if calculation is not numerically stable (:issue:`41053`)
- Bug in :meth:`DataFrame.rolling` returning sum not zero for all ``NaN`` window with ``min_periods=0`` if calculation is not numerically stable (:issue:`41053`)
- Bug in :meth:`SeriesGroupBy.agg` failing to retain ordered :class:`CategoricalDtype` on order-preserving aggregations (:issue:`41147`)
- Bug in :meth:`DataFrameGroupBy.min` and :meth:`DataFrameGroupBy.max` with multiple object-dtype columns and ``numeric_only=False`` incorrectly raising ``ValueError`` (:issue:`41111`)
- Bug in :meth:`DataFrameGroupBy.rank` with the GroupBy object's ``axis=0`` and the ``rank`` method's keyword ``axis=1`` (:issue:`41320`)
- Bug in :meth:`DataFrameGroupBy.__getitem__` with non-unique columns incorrectly returning a malformed :class:`SeriesGroupBy` instead of :class:`DataFrameGroupBy` (:issue:`41427`)
- Bug in :meth:`DataFrameGroupBy.transform` with non-unique columns incorrectly raising ``AttributeError`` (:issue:`41427`)
- Bug in :meth:`Resampler.apply` with non-unique columns incorrectly dropping duplicated columns (:issue:`41445`)
- Bug in :meth:`SeriesGroupBy` aggregations incorrectly returning empty :class:`Series` instead of raising ``TypeError`` on aggregations that are invalid for its dtype, e.g. ``.prod`` with ``datetime64[ns]`` dtype (:issue:`41342`)
- Bug in :meth:`DataFrame.rolling.__iter__` where ``on`` was not assigned to the index of the resulting objects (:issue:`40373`)
- Bug in :meth:`DataFrameGroupBy.transform` and :meth:`DataFrameGroupBy.agg` with ``engine="numba"`` where ``*args`` were being cached with the user passed function (:issue:`41647`)
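The ``KeyError`` behaviour for dictionary aggregation with missing columns (:issue:`40004`) can be sketched as follows, assuming pandas 1.3.0 or later. The frame and the column name ``"missing"`` are hypothetical.

```python
import pandas as pd

frame = pd.DataFrame({"key": ["x", "x", "y"], "value": [1, 2, 3]})

try:
    # "missing" is not a column of the frame, so a KeyError is expected.
    frame.groupby("key").agg({"missing": "sum"})
    raised = None
except KeyError as exc:
    raised = exc

# A dictionary that only names existing columns aggregates normally.
totals = frame.groupby("key").agg({"value": "sum"})
```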
- Bug in :func:`merge` raising error when performing an inner join with partial index and ``right_index`` when no overlap between indices (:issue:`33814`)
- Bug in :meth:`DataFrame.unstack` with missing levels led to incorrect index names (:issue:`37510`)
- Bug in :func:`merge_asof` propagating the right Index with ``left_index=True`` and ``right_on`` specification instead of left Index (:issue:`33463`)
- Bug in :func:`join` over :class:`MultiIndex` returned wrong result, when one of both indexes had only one level (:issue:`36909`)
- :meth:`merge_asof` raises ``ValueError`` instead of cryptic ``TypeError`` in case of non-numerical merge columns (:issue:`29130`)
- Bug in :meth:`DataFrame.join` not assigning values correctly when having :class:`MultiIndex` where at least one dimension is from dtype ``Categorical`` with non-alphabetically sorted categories (:issue:`38502`)
- :meth:`Series.value_counts` and :meth:`Series.mode` return consistent keys in original order (:issue:`12679`, :issue:`11227` and :issue:`39007`)
- Bug in :meth:`DataFrame.stack` not handling ``NaN`` in :class:`MultiIndex` columns correctly (:issue:`39481`)
- Bug in :meth:`DataFrame.apply` would give incorrect results when used with a string argument and ``axis=1`` when the axis argument was not supported and now raises a ``ValueError`` instead (:issue:`39211`)
- Bug in :meth:`DataFrame.sort_values` not reshaping index correctly after sorting on columns, when ``ignore_index=True`` (:issue:`39464`)
- Bug in :meth:`DataFrame.append` returning incorrect dtypes with combinations of ``ExtensionDtype`` dtypes (:issue:`39454`)
- Bug in :meth:`DataFrame.append` returning incorrect dtypes with combinations of ``datetime64`` and ``timedelta64`` dtypes (:issue:`39574`)
- Bug in :meth:`DataFrame.pivot_table` returning a ``MultiIndex`` for a single value when operating on an empty ``DataFrame`` (:issue:`13483`)
- Allow :class:`Index` to be passed to the :func:`numpy.all` function (:issue:`40180`)
- Bug in :meth:`DataFrame.stack` not preserving ``CategoricalDtype`` in a ``MultiIndex`` (:issue:`36991`)
- Bug in :func:`to_datetime` raising error when input sequence contains unhashable items (:issue:`39756`)
- Bug in :meth:`Series.explode` preserving index when ``ignore_index`` was ``True`` and values were scalars (:issue:`40487`)
- Bug in :func:`to_datetime` raising ``ValueError`` when :class:`Series` contains ``None`` and ``NaT`` and has more than 50 elements (:issue:`39882`)
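The :meth:`Series.explode` fix for scalar values with ``ignore_index=True`` (:issue:`40487`) can be sketched as follows, assuming pandas 1.3.0 or later. The labels are made up for the example.

```python
import pandas as pd

scalars = pd.Series([10, 20], index=["a", "b"])

# Nothing needs exploding, but ignore_index=True should still reset the index.
exploded = scalars.explode(ignore_index=True)
```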
- Bug in :meth:`DataFrame.sparse.to_coo` raising ``KeyError`` with columns that are a numeric :class:`Index` without a 0 (:issue:`18414`)
- Bug in :meth:`SparseArray.astype` with ``copy=False`` producing incorrect results when going from integer dtype to floating dtype (:issue:`34456`)
- Implemented :meth:`SparseArray.max` and :meth:`SparseArray.min` (:issue:`40921`)
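A minimal sketch of the new :meth:`SparseArray.max` and :meth:`SparseArray.min` reductions (:issue:`40921`), assuming pandas 1.3.0 or later. The values are illustrative. For integer data the implicit fill value is ``0``, which also takes part in the reduction.

```python
import pandas as pd

sparse = pd.arrays.SparseArray([0, 0, 3, -1])

# Both stored values and the fill value (0) are considered.
largest = sparse.max()
smallest = sparse.min()
```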
- Bug in :meth:`DataFrame.where` when ``other`` is a :class:`Series` with :class:`ExtensionArray` dtype (:issue:`38729`)
- Fixed bug where :meth:`Series.idxmax`, :meth:`Series.idxmin` and ``argmax/min`` fail when the underlying data is :class:`ExtensionArray` (:issue:`32749`, :issue:`33719`, :issue:`36566`)
- Fixed a bug where some properties of subclasses of :class:`PandasExtensionDtype` were improperly cached (:issue:`40329`)
- Bug in :meth:`DataFrame.mask` where masking a :class:`DataFrame` with an :class:`ExtensionArray` dtype raises ``ValueError`` (:issue:`40941`)
- Bug in :class:`Styler` where ``subset`` arg in methods raised an error for some valid multiindex slices (:issue:`33562`)
- :class:`Styler` rendered HTML output has seen minor alterations to support w3 good code standard (:issue:`39626`)
- Bug in :class:`Styler` where rendered HTML was missing a column class identifier for certain header cells (:issue:`39716`)
- Bug in :meth:`Styler.background_gradient` where text-color was not determined correctly (:issue:`39888`)
- Bug in :class:`Styler` where multiple elements in CSS-selectors were not correctly added to ``table_styles`` (:issue:`39942`)
- Bug in :class:`.Styler` where copying from Jupyter dropped top left cell and misaligned headers (:issue:`12147`)
- Bug in :class:`.Styler.where` where ``kwargs`` were not passed to the applicable callable (:issue:`40845`)
- Bug in :class:`Styler` which caused CSS to duplicate on multiple renders. (:issue:`39395`, :issue:`40334`)
- Bug in :class:`Index` constructor sometimes silently ignoring a specified ``dtype`` (:issue:`38879`)
- Bug in :func:`pandas.api.types.infer_dtype` not recognizing Series, Index or array with a period dtype (:issue:`23553`)
- Bug in :func:`pandas.api.types.infer_dtype` raising an error for general :class:`.ExtensionArray` objects. It will now return ``"unknown-array"`` instead of raising (:issue:`37367`)
- Bug in constructing a :class:`Series` from a list and a :class:`PandasDtype` (:issue:`39357`)
- ``inspect.getmembers(Series)`` no longer raises an ``AbstractMethodError`` (:issue:`38782`)
- Bug in :meth:`Series.where` with numeric dtype and ``other = None`` not casting to ``nan`` (:issue:`39761`)
- :meth:`Index.where` behavior now mirrors :meth:`Index.putmask` behavior, i.e. ``index.where(mask, other)`` matches ``index.putmask(~mask, other)`` (:issue:`39412`)
- Bug in :func:`pandas.testing.assert_series_equal`, :func:`pandas.testing.assert_frame_equal`, :func:`pandas.testing.assert_index_equal` and :func:`pandas.testing.assert_extension_array_equal` incorrectly raising when an attribute has an unrecognized NA type (:issue:`39461`)
- Bug in :func:`pandas.testing.assert_index_equal` with ``exact=True`` not raising when comparing :class:`CategoricalIndex` instances with ``Int64Index`` and ``RangeIndex`` categories (:issue:`41263`)
- Bug in :meth:`DataFrame.equals`, :meth:`Series.equals`, :meth:`Index.equals` with object-dtype containing ``np.datetime64("NaT")`` or ``np.timedelta64("NaT")`` (:issue:`39650`)
- Bug in :func:`pandas.util.show_versions` where console JSON output was not proper JSON (:issue:`39701`)
- Let Pandas compile on z/OS when using xlc (:issue:`35826`)
- Bug in :meth:`DataFrame.convert_dtypes` incorrectly raised ValueError when called on an empty DataFrame (:issue:`40393`)
- Bug in :meth:`DataFrame.agg()` not sorting the aggregated axis in the order of the provided aggregation functions when one or more aggregation functions fail to produce results (:issue:`33634`)
- Bug in :meth:`DataFrame.clip` not interpreting missing values as no threshold (:issue:`40420`)
- Bug in :class:`Series` backed by :class:`DatetimeArray` or :class:`TimedeltaArray` sometimes failing to set the array's ``freq`` to ``None`` (:issue:`41425`)
- Bug in creating a :class:`Series` from a ``range`` object that does not fit in the bounds of ``int64`` dtype (:issue:`30173`)
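The relationship between :meth:`Index.where` and :meth:`Index.putmask` noted above (:issue:`39412`) can be sketched as follows, assuming pandas 1.3.0 or later. The index values and mask are illustrative.

```python
import numpy as np
import pandas as pd

index = pd.Index([1, 2, 3])
mask = np.array([True, False, True])

# where keeps entries where the mask is True and replaces the rest ...
kept_where_true = index.where(mask, 0)
# ... while putmask replaces entries where its mask is True, hence the inversion.
replaced_where_false = index.putmask(~mask, 0)
```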