adding show() option missingstring to change the string that prints for missing values #1688
Thank you for the proposal. Here are some general comments that could be included in it (apart from the detailed recommendations I have left inline):
- even in this limited scope we should also add methods in dataframerow/show;
- handling CSV and TSV should be easy, as printtable already handles this keyword argument;
- I would not leave HTML and LaTeX for later, as we tended to do in the past; in the long term this leads to inconsistency in the framework that is hard to clean up, as these are small details;
- all functionality we add should have test cases added.
Co-Authored-By: ianshmean <[email protected]>
Note that LaTeX has the default missingstring="" as prior default was "", unlike other functions
Thanks @bkamins. I believe I've added everything you listed.
Note that the default for latex is |
Thank you. Please let me know when the tests pass and I will review the changes. Regarding LaTeX - please change the default to The reason to change it to |
@bkamins The latex default is now "missing" and the tests now pass. However, I had to make changes to the MIME types that need review.
@@ -92,11 +92,11 @@ function html_escape(cell::AbstractString)
    return cell
end

Base.show(io::IO, mime::MIME"text/html", df::AbstractDataFrame; summary::Bool=true) =
    _show(io, mime, df, summary=summary)
Base.show(io::IO, mime::MIME"text/html", df::AbstractDataFrame; summary::Bool=true, missingstring::AbstractString="missing") =
Can you please correctly align the indentation of the code here and in all places below?
Looks good to me. I have three comments:
MIME
I think we should provide methods performing auto-conversion like in Base, where we have (@nalimilan - do you have the same opinion here?):
(of course with proper type restrictions; one has to be careful here to check that method ambiguities were not introduced)
In particular the PR should be checked on Jupyter notebook + HTML and LaTeX/PDF export to make sure that nothing fails (I can do it if it would be a problem for you - please let me know). The reason is that in its current state the PR is very likely to fail there (I have not checked; however, it breaks backward compatibility without the fix I propose, so there is a risk of failure).
Documentation
It would be excellent to add a small note in the documentation about the newly introduced features.
Rebasing
The test files require rebasing because today a major test code clean-up was performed. The major change was removal of indentation in
Thank you for your contribution (and sorry that the seemingly simple PR became more work than expected, but we are trying hard to hit the DataFrames.jl 1.0 release with a clean package).
Thanks for the guidance @bkamins. I would prefer to not do the rebase, so I graciously accept the offer! :) thanks |
test/io.jl
end

@testset "Huge LaTeX export" begin
    df = DataFrame(a=1:1000)
    ioc = IOContext(IOBuffer(), :displaysize => (10, 10), :limit => false)
    show(ioc, "text/latex", df)
    show(ioc, MIME(Symbol("text/latex")), df)
Symbol isn't needed AFAICT.
My point was that it should be just "text/latex" if we fix the signatures.
Indeed the fallback method doesn't support keyword arguments. Something like this shouldn't introduce ambiguities: show(io::IO, m::AbstractString, x::AbstractDataFrame; kwargs...) = show(io, MIME(m), x; kwargs...) But we should file an issue or PR against Base to improve this. |
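The forwarding method suggested above can be tried in isolation. This is only a minimal sketch: ToyTable is a hypothetical stand-in for AbstractDataFrame so the example runs without DataFrames.jl, and the HTML cell rendering is invented for illustration and is not the package's actual output.

```julia
# Hypothetical stand-in for AbstractDataFrame (assumption, not a DataFrames.jl type).
struct ToyTable
    col::Vector{Union{Missing, Int}}
end

# MIME-specific method that accepts the proposed missingstring keyword.
function Base.show(io::IO, ::MIME"text/html", t::ToyTable;
                   missingstring::AbstractString="missing")
    for v in t.col
        print(io, "<td>", ismissing(v) ? missingstring : string(v), "</td>")
    end
end

# Forwarding method: converts the MIME string and passes keyword arguments on.
# It is more specific than the generic string fallback in Base, which takes an
# untyped third argument and no keyword arguments.
Base.show(io::IO, m::AbstractString, t::ToyTable; kwargs...) =
    show(io, MIME(m), t; kwargs...)

buf = IOBuffer()
show(buf, "text/html", ToyTable([1, missing, 3]); missingstring="-")
html_out = String(take!(buf))
```

With the forwarding method in place, a call that passes the MIME type as a plain string keeps working and also accepts the keyword, which is the backward-compatibility concern raised in this thread.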
I am not sure methods in base are assumed to accept keyword arguments. AFAICT the intended way to do it is to pass the arguments in My thinking was to define the method also you have proposed - at least short term and decide about |
@ianshmean I am starting to rebase it, so please do not push anything to this PR in the meantime 😄.
Done. A small remark - it is safer to make a PR in a branch not on master (nothing big, but GitHub complains). I have also re-introduced |
It's not clear whether all of these options should go through |
I am also not 100% clear 😄 - but I have checked that out of 270 methods for |
@ianshmean + @nalimilan any opinion what we should do with this PR (note that it requires a serious rebasing since we did a heavy cleanup of |
If @ianshmean can rebase it I think we should add the |
I mark it non-breaking as I understand what is proposed here can be added at any time. |
@ianshmean - do you think you would find time to work on this PR to try finalizing it? |
@bkamins Sorry, I don't have time to complete this now |
OK - thank you for the response and submitting the PR. If PrettyTables.jl has this possibility I will close this PR then. |
In PrettyTables, missing is shown as "missing". However, it is trivial to use a data formatter to change missing to "-", for example:
julia> df = DataFrame(a=1:3,b=[1,missing,3]);
julia> pretty_table(df, formatters = ((v,i,j)->ismissing(v) ? "-" : v,))
┌───────┬───────────────────────┐
│ a │ b │
│ Int64 │ Union{Missing, Int64} │
├───────┼───────────────────────┤
│ 1 │ 1 │
│ 2 │ - │
│ 3 │ 3 │
└───────┴───────────────────────┘
Thank you for responding. This is what I have assumed, in this case I close it. As a reference:
can be written as:
|
I've found that it's sometimes visually helpful to print a smaller string than "missing" for missing values. This provides a way to change the string during show(), if desired, i.e.
show(df,missingstring="-")
Further work is needed for non-default outputs (HTML, LaTeX, CSV and TSV)
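To illustrate the motivation, here is a rough sketch of how a shorter placeholder narrows a printed column. The helpers cellstring and colwidth are hypothetical names made up for this example; they are not DataFrames.jl functions and do not reflect how show() is implemented.

```julia
# Hypothetical helper: render one cell, substituting a placeholder for missing.
cellstring(v; missingstring::AbstractString="missing") =
    ismissing(v) ? missingstring : string(v)

# Hypothetical helper: width needed to print a column of rendered cells.
colwidth(cells) = maximum(length, cells)

col = [1, missing, 3]
default_cells = [cellstring(v) for v in col]
short_cells = [cellstring(v; missingstring="-") for v in col]

println("default width: ", colwidth(default_cells))
println("short width:   ", colwidth(short_cells))
```

In this toy column the placeholder is the widest entry, so replacing it is what shrinks the column.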