BUG-17280 to_html follows display.precision for column numbers in notebooks #25914

JustinZhengBC · 2019-03-28T21:19:34Z

closes display.precision not honored for column headers #17280
tests added / passed
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

When printing column labels, check whether they are floats and, if so, round them according to the display.precision setting.
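The rule described above can be sketched in plain Python. This is only an illustration of the idea, not the code in this PR: `format_column_label` is a hypothetical helper, its `precision` argument stands in for display.precision, and fixed-point formatting is an assumption (pandas' own float formatting differs in details).

```python
def format_column_label(label, precision=6):
    """Return a display string for one column label (hypothetical helper)."""
    # Only float labels are rounded; int and str labels pass through unchanged.
    if isinstance(label, float):
        return f"{label:.{precision}f}"
    return str(label)


# The header from the PR's test case, rendered with a precision of 3.
print([format_column_label(c, precision=3) for c in [0.55555, 2, "name"]])
```

Non-float labels are deliberately left alone, so integer or string headers are unaffected by the precision setting.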

codecov · 2019-03-28T22:12:14Z

Codecov Report

Merging #25914 into master will decrease coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #25914      +/-   ##
==========================================
- Coverage   91.77%   91.77%   -0.01%     
==========================================
  Files         175      175              
  Lines       52607    52610       +3     
==========================================
- Hits        48282    48281       -1     
- Misses       4325     4329       +4

Flag	Coverage Δ
#multiple	`90.32% <100%> (ø)`	⬆️
#single	`41.9% <0%> (-0.08%)`	⬇️

Impacted Files	Coverage Δ
pandas/io/formats/html.py	`99.36% <100%> (ø)`	⬆️
pandas/io/gbq.py	`75% <0%> (-12.5%)`	⬇️
pandas/core/frame.py	`96.79% <0%> (-0.12%)`	⬇️
pandas/io/formats/css.py	`100% <0%> (ø)`	⬆️
pandas/io/formats/excel.py	`97.4% <0%> (ø)`	⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8b9f933...9040b98. Read the comment docs.

codecov · 2019-03-28T22:12:20Z

Codecov Report

Merging #25914 into master will decrease coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #25914      +/-   ##
==========================================
- Coverage   91.84%   91.83%   -0.01%     
==========================================
  Files         175      175              
  Lines       52550    52554       +4     
==========================================
- Hits        48266    48265       -1     
- Misses       4284     4289       +5

Flag	Coverage Δ
#multiple	`90.39% <100%> (ø)`	⬆️
#single	`41.89% <0%> (-0.07%)`	⬇️

Impacted Files	Coverage Δ
pandas/io/formats/html.py	`99.36% <100%> (ø)`	⬆️
pandas/io/gbq.py	`75% <0%> (-12.5%)`	⬇️
pandas/core/frame.py	`96.79% <0%> (-0.12%)`	⬇️
pandas/util/testing.py	`90.61% <0%> (-0.11%)`	⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4814a28...ebcc8f8. Read the comment docs.

simonjayhawkins · 2019-03-28T22:38:35Z

As this is a display option, shouldn't it only affect the notebook display and leave the to_html untouched?

JustinZhengBC · 2019-03-29T01:44:02Z

I have tested this fix in a notebook and it works as expected. As for whether this is the appropriate location for a fix, NotebookFormatter is the class responsible for displaying data in Jupyter notebooks, and it inherits from HTMLFormatter (and the bug was present in to_html(notebook=False) as well).

simonjayhawkins · 2019-03-29T01:56:45Z

and the bug was present in to_html(notebook=False) as well

I would argue it is not a bug here, since the display options should not affect the to_html() output. compare with the display options max_rows, max_columns, show_dimensions, max_colwidth, etc., which only apply to the notebook display.

pep8speaks · 2019-03-29T02:35:15Z

Hello @JustinZhengBC! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2019-04-03 15:23:23 UTC

JustinZhengBC · 2019-03-29T02:37:20Z

I agree. I have altered the fix so it only applies to NotebookFormatter and not HTMLFormatter, since to_html is used for display purposes in a notebook.

jreback · 2019-03-29T12:26:45Z

pandas/tests/io/formats/test_to_html.py

+
+
+def test_to_html_round_column_headers():
+    df = DataFrame([1], columns=[0.55555])


add the issue number

jreback · 2019-03-29T12:27:38Z

pandas/io/formats/html.py

@@ -491,6 +491,15 @@ class NotebookFormatter(HTMLFormatter):
    DataFrame._repr_html_() and DataFrame.to_html(notebook=True)
    """

+    def __init__(self, formatter, classes=None, border=None):


does this really need to be in __init__? rather have this call a method in the base class which is overridden here, e.g.

self.columns = self._get_columns_formatted_values()
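A minimal sketch of the pattern suggested here, assuming simplified stand-in classes rather than the real pandas formatters: the base class calls a hook method once, and the notebook subclass overrides only that hook, so it needs no `__init__` of its own. The class names and the `precision` attribute are hypothetical; only the method name `_get_columns_formatted_values` comes from the comment above.

```python
class HTMLFormatterSketch:
    """Simplified stand-in for a base HTML formatter (not the pandas class)."""

    def __init__(self, columns):
        self.raw_columns = list(columns)
        # The base __init__ calls the hook, so subclasses only override the hook.
        self.columns = self._get_columns_formatted_values()

    def _get_columns_formatted_values(self):
        # Base behaviour: leave the labels untouched.
        return self.raw_columns


class NotebookFormatterSketch(HTMLFormatterSketch):
    precision = 3  # assumed stand-in for display.precision

    def _get_columns_formatted_values(self):
        # Notebook behaviour: round float labels, keep everything else as-is.
        return [
            f"{c:.{self.precision}f}" if isinstance(c, float) else c
            for c in self.raw_columns
        ]


print(HTMLFormatterSketch([0.55555, "a"]).columns)
print(NotebookFormatterSketch([0.55555, "a"]).columns)
```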

rather have this call a method in the base class which is overridden here

in #24651, i separated out the notebook functionality from HTMLFormatter into NotebookFormatter using HTMLFormatter as the base class. the base class of HTMLFormatter is TableFormatter which is also the base class of DataFrameFormatter (used for to_string and to_latex) and LatexFormatter.

I envisaged that at some point in the development cycle it may become desirable to also create a ToHTMLFormatter using HTMLFormatter as the base.

IMO HTMLFormatter and NotebookFormatter are not the appropriate location for non-markup related formatting issues that are common across output-formatting methods. I think this issue is in that category and should ideally be in DataFrameFormatter or TableFormatter.

However, if to close the open issue the fix is somewhere in io/formats/html.py, then i think a TODO to remove the code at a later date should be added.

and the bug was present in to_html(notebook=False) as well

I would argue it is not a bug here, since the display options should not affect the to_html() output. compare with the display options max_rows, max_columns, show_dimensions, max_colwidth, etc., which only apply to the notebook display.

I think I'm getting a bit confused here. Doesn't this mean that this is not an issue that is common across formatters? Because to_html should only check display preferences when used for display purposes in a notebook and not when generating HTML for a web page?

yeah. i can understand why you're confused. there is a slight problem with the current class hierarchy. the workaround for max_colwidth was to use with option_context to ignore the display option for to_html, see #24841

i think the ideal solution would be to have, say, a use_display_options attribute in DataFrameFormatter and then in HTMLFormatter we just have self.fmt.use_display_options = False and in NotebookFormatter we have self.fmt.use_display_options = True and then all the display formatting could be handled in io/formats/format.py.
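The use_display_options idea could look roughly like the sketch below. Nothing here exists in pandas: `FormatterSketch`, `HTMLSketch` and `NotebookSketch` are hypothetical stand-ins for DataFrameFormatter, HTMLFormatter and NotebookFormatter, and `precision` stands in for display.precision. The point is only that one flag on the shared formatter decides whether display options apply.

```python
from dataclasses import dataclass


@dataclass
class FormatterSketch:
    """Hypothetical stand-in for the shared formatter object (self.fmt)."""

    precision: int = 3  # stand-in for display.precision
    use_display_options: bool = True

    def format_label(self, label):
        # All display-related formatting lives in one place and checks the flag.
        if self.use_display_options and isinstance(label, float):
            return f"{label:.{self.precision}f}"
        return str(label)


class HTMLSketch:
    use_display_options = False  # plain to_html(): ignore display options

    def __init__(self, fmt):
        self.fmt = fmt
        self.fmt.use_display_options = self.use_display_options

    def header(self, columns):
        return [self.fmt.format_label(c) for c in columns]


class NotebookSketch(HTMLSketch):
    use_display_options = True  # notebook repr: honour display options


print(HTMLSketch(FormatterSketch()).header([0.55555]))
print(NotebookSketch(FormatterSketch()).header([0.55555]))
```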

But this is way outside the scope of this PR. So a with option_context work-around would be fine for now IMO.
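For readers unfamiliar with the option_context style of workaround, here is a stdlib-only sketch of the mechanism: temporarily override a display option while rendering, then restore it. `OPTIONS`, `option_context_sketch`, `render_cell` and `to_html_sketch` are all hypothetical names; this is not pandas' implementation and not the code from #24841.

```python
from contextlib import contextmanager

OPTIONS = {"display.max_colwidth": 50}  # hypothetical global option store


@contextmanager
def option_context_sketch(name, value):
    """Temporarily set one option, restoring the old value afterwards."""
    old = OPTIONS[name]
    OPTIONS[name] = value
    try:
        yield
    finally:
        OPTIONS[name] = old


def render_cell(text):
    # Truncate cell text to the current column-width limit, if one is set.
    limit = OPTIONS["display.max_colwidth"]
    if limit is not None and len(text) > limit:
        return text[: limit - 3] + "..."
    return text


def to_html_sketch(text):
    # Non-notebook output ignores the display option by disabling it locally.
    with option_context_sketch("display.max_colwidth", None):
        return render_cell(text)


long_text = "x" * 80
print(len(render_cell(long_text)), len(to_html_sketch(long_text)))
```

The `finally` clause matters: the option is restored even if rendering raises, so the override cannot leak into later notebook output.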

simonjayhawkins · 2019-03-29T12:55:15Z

As for whether this is the appropriate location for a fix

i've opened separate issues for float values in index names #25917, object indexes #25919 and float-like values #25920. So this is probably not the appropriate location but i guess as a bug-fix that would be OK.

simonjayhawkins · 2019-03-30T09:53:38Z

I think if, for now, you just change

pandas/pandas/io/formats/html.py

Line 295 in 96a128e

row.extend(self.columns)

to

row.extend(self.columns.format())

it'll fix the open issue by making the behavior for the single-level column labels consistent with the index labels and the MultiIndex column labels.

if the behavior should be different in to_html() output than in the notebook display, then this would probably need to include changing the index labels and the MultiIndex column labels, so it is outside the scope of this PR.

then in the future we probably need a _get_formatted_column_labels method for code parity with DataFrameFormatter

JustinZhengBC · 2019-04-01T23:30:55Z

@simonjayhawkins your fix works. It also changes columns with None to NaN as a side effect. Is that okay? If so we can go with that.

simonjayhawkins · 2019-04-02T00:22:21Z

It also changes columns with None to NaN as a side effect. Is that okay? If so we can go with that.

i'm getting nan changed to NaN for None, which is consistent with the index and the multiIndex columns cases.

import numpy as np
import pandas as pd

pd.options.display.precision = 3
labels = np.random.rand(10)
labels[3] = None  # None is stored as NaN in a float array
p = pd.DataFrame(np.random.randn(10, 10), index=labels, columns=labels)
p

	0.04785803831057456	0.6401016774531172	0.9164407229270924	nan	0.6926779069059682	0.4610792740713505	0.9600160671031376	0.35015957890745153	0.14431674279365037	0.6268220237028823
0.048	-1.836	-0.269	1.081	0.267	-0.195	-0.586	0.916	2.162	0.572	0.198
0.640	0.757	0.656	1.502	-0.705	0.379	1.254	-0.393	-0.478	0.940	1.735
0.916	0.298	0.816	0.239	-0.166	-0.503	-0.261	2.781	-1.195	-0.613	1.763
NaN	-0.792	0.622	-0.738	0.489	-1.947	-0.804	0.578	-0.086	-1.157	0.471
0.693	-0.414	-0.122	-1.248	-0.410	-1.132	0.315	-1.116	-0.746	-0.977	2.093
0.461	2.213	-0.615	0.594	-0.262	1.749	0.994	-0.539	1.187	0.156	0.673
0.960	-1.578	0.199	0.903	-0.578	0.859	0.969	-0.030	-1.371	-0.330	-0.141
0.350	0.023	1.053	0.902	1.489	1.278	-1.473	0.910	0.197	-0.461	0.714
0.144	1.797	0.233	-0.672	-0.288	-0.455	-0.872	-1.535	-2.370	1.545	-0.962
0.627	-2.079	-0.473	-2.075	0.391	-0.228	0.751	0.155	0.648	-0.679	-1.043

p.columns.format()
['0.249',
 '0.833',
 '0.730',
 'NaN  ',
 '0.520',
 '0.545',
 '0.960',
 '0.233',
 '0.247',
 '0.990']

multi = pd.MultiIndex.from_arrays([labels, labels])
p = pd.DataFrame(np.random.randn(10, 10), index=multi, columns=multi)
p

		0.249	0.833	0.730	NaN	0.520	0.545	0.960	0.233	0.247	0.990
		0.249	0.833	0.730	NaN	0.520	0.545	0.960	0.233	0.247	0.990
0.249	0.249	0.419	0.043	-0.141	0.408	-0.040	1.376	-0.041	-1.439	0.192	-1.223
0.833	0.833	0.880	-1.105	-0.302	-0.652	-1.237	0.851	-0.481	1.156	-0.787	1.032
0.730	0.730	-0.939	-0.504	-0.397	0.647	-0.898	-0.219	-0.285	-1.120	-1.268	-0.277
NaN	NaN	-0.072	0.020	1.558	0.463	-1.322	0.388	0.373	0.107	-1.913	0.370
0.520	0.520	-0.343	-0.121	2.011	0.068	0.409	-0.326	-0.485	1.271	1.244	-0.313
0.545	0.545	-0.977	-0.755	0.145	0.154	0.293	1.243	1.441	-0.198	-0.318	0.339
0.960	0.960	-1.041	1.479	-1.120	0.101	-1.302	1.436	0.572	-0.806	1.306	-1.249
0.233	0.233	-0.021	-0.742	-0.394	0.185	0.028	-0.999	1.373	-1.598	1.140	-0.386
0.247	0.247	0.280	-0.246	0.222	-0.767	-1.433	0.144	-0.466	0.920	0.941	1.195
0.990	0.990	0.633	-1.737	-0.649	0.396	-0.007	-0.897	0.635	1.555	-2.001	0.047
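The `'NaN  '` entry in the `p.columns.format()` output above suggests two things: missing labels are rendered as `NaN`, and all labels are padded to a common width. A minimal sketch of that behaviour, assuming fixed-point rounding and left-justified padding (`format_float_labels` is a hypothetical helper, not the pandas implementation):

```python
import math


def format_float_labels(labels, precision=3):
    """Round float labels, show missing values as 'NaN', pad to equal width."""
    texts = ["NaN" if math.isnan(x) else f"{x:.{precision}f}" for x in labels]
    width = max((len(t) for t in texts), default=0)
    # Left-justify so that shorter entries such as 'NaN' get trailing spaces.
    return [t.ljust(width) for t in texts]


print(format_float_labels([0.249, float("nan"), 0.99]))
```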

simonjayhawkins · 2019-04-03T09:15:07Z

@JustinZhengBC : lgtm. #25914 (comment) still to do?

JustinZhengBC · 2019-04-03T22:46:50Z

@simonjayhawkins sorry about that, thanks for catching it.

jreback · 2019-04-04T13:02:03Z

@simonjayhawkins this is orthogonal to your other PR? #25983

simonjayhawkins · 2019-04-04T14:03:52Z

@simonjayhawkins this is orthogonal to your other PR? #25983

yes. this is ok. probably a merge conflict on the test, but just an accept both.

jreback · 2019-04-04T20:18:18Z

thanks @JustinZhengBC

BUG-17280 to_html follows display.precision for column numbers

9040b98

applies to notebooks only

21d8273

fix whitespace

8e0b7ee

JustinZhengBC changed the title ~~BUG-17280 to_html follows display.precision for column numbers~~ BUG-17280 to_html follows display.precision for column numbers in notebooks Mar 29, 2019

fix whitespace

87c183e

jreback requested changes Mar 29, 2019

jreback added the Output-Formatting __repr__ of pandas objects, to_string label Mar 29, 2019

move formatting to _get_columns_formatted_values function

f7d8086

JustinZhengBC added 2 commits April 2, 2019 23:42

use self.columns.format()

50aea27

Merge branch 'master' into BUG-17280

3998835

JustinZhengBC added 2 commits April 3, 2019 08:22

add issue number

4ea1932

Merge branch 'BUG-17280' of https://github.com/justinzhengbc/pandas i…

ebcc8f8

…nto BUG-17280

jreback added this to the 0.25.0 milestone Apr 4, 2019

jreback approved these changes Apr 4, 2019

jreback merged commit 013f4b4 into pandas-dev:master Apr 4, 2019
