After seeing this, I tried to use the new by syntax. An MWE is:
using DataFrames
using Statistics

df = DataFrame(a = repeat([1, 2, 3, 4], outer=[2]), b = repeat([2, 1], outer=[4]), c = 1:8);

julia> df
8×3 DataFrame
│ Row │ a     │ b     │ c     │
│     │ Int64 │ Int64 │ Int64 │
├─────┼───────┼───────┼───────┤
│ 1   │ 1     │ 2     │ 1     │
│ 2   │ 2     │ 1     │ 2     │
│ 3   │ 3     │ 2     │ 3     │
│ 4   │ 4     │ 1     │ 4     │
│ 5   │ 1     │ 2     │ 5     │
│ 6   │ 2     │ 1     │ 6     │
│ 7   │ 3     │ 2     │ 7     │
│ 8   │ 4     │ 1     │ 8     │

julia> by(df, :, :a, :c=>mean)
8×4 DataFrame
│ Row │ a     │ b     │ c     │ c_mean  │
│     │ Int64 │ Int64 │ Int64 │ Float64 │
├─────┼───────┼───────┼───────┼─────────┤
│ 1   │ 1     │ 2     │ 1     │ 1.0     │
│ 2   │ 2     │ 1     │ 2     │ 2.0     │
│ 3   │ 3     │ 2     │ 3     │ 3.0     │
│ 4   │ 4     │ 1     │ 4     │ 4.0     │
│ 5   │ 1     │ 2     │ 5     │ 5.0     │
│ 6   │ 2     │ 1     │ 6     │ 6.0     │
│ 7   │ 3     │ 2     │ 7     │ 7.0     │
│ 8   │ 4     │ 1     │ 8     │ 8.0     │
or
julia> by(df, :, [:a], :c=>mean)
8×4 DataFrame
│ Row │ a     │ b     │ c     │ c_mean  │
│     │ Int64 │ Int64 │ Int64 │ Float64 │
├─────┼───────┼───────┼───────┼─────────┤
│ 1   │ 1     │ 2     │ 1     │ 1.0     │
│ 2   │ 2     │ 1     │ 2     │ 2.0     │
│ 3   │ 3     │ 2     │ 3     │ 3.0     │
│ 4   │ 4     │ 1     │ 4     │ 4.0     │
│ 5   │ 1     │ 2     │ 5     │ 5.0     │
│ 6   │ 2     │ 1     │ 6     │ 6.0     │
│ 7   │ 3     │ 2     │ 7     │ 7.0     │
│ 8   │ 4     │ 1     │ 8     │ 8.0     │
Note: DataFrames has been updated to [a93c6f00] DataFrames v0.20.0 #master (https://github.com/JuliaData/DataFrames.jl.git)
What I would expect is:
julia> by(df, :, [:a], :c=>mean)
8×4 DataFrame
│ Row │ a     │ b     │ c     │ c_mean  │
│     │ Int64 │ Int64 │ Int64 │ Float64 │
├─────┼───────┼───────┼───────┼─────────┤
│ 1   │ 1     │ 2     │ 1     │ 3.0     │
│ 2   │ 2     │ 1     │ 2     │ 4.0     │
│ 3   │ 3     │ 2     │ 3     │ 5.0     │
│ 4   │ 4     │ 1     │ 4     │ 6.0     │
│ 5   │ 1     │ 2     │ 5     │ 3.0     │
│ 6   │ 2     │ 1     │ 6     │ 4.0     │
│ 7   │ 3     │ 2     │ 7     │ 5.0     │
│ 8   │ 4     │ 1     │ 8     │ 6.0     │
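To make the expected c_mean column concrete, here is a minimal sketch that computes the per-group mean of c for each value of a using plain vectors and only the Statistics standard library. The names group_means and c_mean are illustrative and do not come from the original post; this is not how by works internally.

```julia
using Statistics

# Same a and c columns as in the MWE.
a = repeat([1, 2, 3, 4], outer=2)   # [1, 2, 3, 4, 1, 2, 3, 4]
c = 1:8

# Mean of c within each group of a (group_means is a hypothetical helper name).
group_means = Dict(k => mean(c[a .== k]) for k in unique(a))

# Broadcast each group's mean back to the original row order.
c_mean = [group_means[k] for k in a]
# c_mean == [3.0, 4.0, 5.0, 6.0, 3.0, 4.0, 5.0, 6.0]
```

Each row receives the mean of its own group, and the rows keep their original order, which is what the expected table shows.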
BTW: I can track down this issue, but maybe a little later. Thanks.
You are using the wrong order of arguments:
julia> by(df, :a, :, :c=>mean)
8×4 DataFrame
│ Row │ a     │ b     │ c     │ c_mean  │
│     │ Int64 │ Int64 │ Int64 │ Float64 │
├─────┼───────┼───────┼───────┼─────────┤
│ 1   │ 1     │ 2     │ 1     │ 3.0     │
│ 2   │ 1     │ 2     │ 5     │ 3.0     │
│ 3   │ 2     │ 1     │ 2     │ 4.0     │
│ 4   │ 2     │ 1     │ 6     │ 4.0     │
│ 5   │ 3     │ 2     │ 3     │ 5.0     │
│ 6   │ 3     │ 2     │ 7     │ 5.0     │
│ 7   │ 4     │ 1     │ 4     │ 6.0     │
│ 8   │ 4     │ 1     │ 8     │ 6.0     │
This does not produce exactly what you want, though, because the rows come back in a different order. That can be fixed by calling, e.g.:
julia> sort!(by(df, :a, :, :c=>mean), :c)
8×4 DataFrame
│ Row │ a     │ b     │ c     │ c_mean  │
│     │ Int64 │ Int64 │ Int64 │ Float64 │
├─────┼───────┼───────┼───────┼─────────┤
│ 1   │ 1     │ 2     │ 1     │ 3.0     │
│ 2   │ 2     │ 1     │ 2     │ 4.0     │
│ 3   │ 3     │ 2     │ 3     │ 5.0     │
│ 4   │ 4     │ 1     │ 4     │ 6.0     │
│ 5   │ 1     │ 2     │ 5     │ 3.0     │
│ 6   │ 2     │ 1     │ 6     │ 4.0     │
│ 7   │ 3     │ 2     │ 7     │ 5.0     │
│ 8   │ 4     │ 1     │ 8     │ 6.0     │
but admittedly in some cases you can expect the order to be preserved without having to call sort!.
The latter is tracked in #2172, so I am closing this issue (but please comment if I have missed something from your original post).
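For readers on a later release: a minimal sketch, assuming DataFrames.jl 1.x, where transform applied to a grouped data frame returns the rows in the order of the parent data frame, so no sort! is needed. This is not the by syntax discussed in this issue, and the variable name res is illustrative.

```julia
using DataFrames
using Statistics

df = DataFrame(a = repeat([1, 2, 3, 4], outer=2),
               b = repeat([2, 1], outer=4),
               c = 1:8)

# Group by :a and append the per-group mean of :c as a new column named c_mean.
# Assumption: DataFrames.jl 1.x, where the result keeps the parent's row order.
res = transform(groupby(df, :a), :c => mean)
```

Under that assumption, res has the columns a, b, c and c_mean, with c still running from 1 to 8.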
I see. The original post on Discourse suggested the wrong order. Thanks.