groupby agg with rank and parameter return does not reduce #14741

Open
jesrael opened this issue Nov 25, 2016 · 2 comments
Labels: Apply, Bug, Groupby

jesrael commented Nov 25, 2016

Sample with a MultiIndex in the columns:

import pandas as pd

df = pd.DataFrame({('r', 'c'): {(1, '2016-11-01 00:00:00+00:00', 3121): 143, (1, '2016-11-01 00:00:00+00:00', 4880): 12, (1, '2016-11-01 00:00:00+00:00', 3953): 4, (1, '2016-11-01 00:00:00+00:00', 3923): 11}})
df.index.names = ['z','x','y']  
print (df)
                                    r
                                    c
z x                         y        
1 2016-11-01 00:00:00+00:00 3121  143
                            3923   11
                            3953    4
                            4880   12
                            
x = 'x'
y = 'y'

# works perfectly
print (df.groupby(level=[x, y]).agg({('r', 'c'): 'rank'}))

print (df.groupby(level=[x, y]).agg({('r', 'c'): lambda x: x.rank(ascending=False)}))
#ValueError: Function does not reduce

The problem also occurs with non-MultiIndex columns:

df = pd.DataFrame({'A':[1,1,3,3],
                   'B':[4,5,6,1]})

print (df)
   A  B
0  1  4
1  1  5
2  3  6
3  3  1

print (df.groupby('A').agg({'B': 'rank'}))
     B
0  1.0
1  2.0
2  2.0
3  1.0
print (df.groupby('A').agg({'B': lambda x: x.rank()}))
#Exception: Must produce aggregated value
print (df.groupby('A').agg({'B': lambda x: x.rank(ascending=False)}))
#Exception: Must produce aggregated value

The question is: how can I use a function with a parameter in agg? (See the SO question.)

print (pd.show_versions())
INSTALLED VERSIONS
------------------
commit: None
python: 3.5.1.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 60 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: sk_SK
LOCALE: None.None

pandas: 0.19.1
nose: 1.3.7
pip: 8.1.1
setuptools: 20.3
Cython: 0.23.4
numpy: 1.11.0
scipy: 0.17.0
statsmodels: 0.6.1
xarray: None
IPython: 4.1.2
sphinx: 1.3.1
patsy: 0.4.0
dateutil: 2.5.1
pytz: 2016.2
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.5.2
matplotlib: 1.5.1
openpyxl: 2.3.2
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: 0.8.4
lxml: 3.6.0
bs4: 4.4.1
html5lib: 0.999
httplib2: None
apiclient: None
sqlalchemy: 1.0.12
pymysql: None
psycopg2: None
jinja2: 2.8
boto: 2.39.0
pandas_datareader: 0.2.1
None

Thank you for pandas and for your perfect documentation.

jreback commented Nov 25, 2016

In [13]: df.groupby(level=[x,y]).rank(ascending=False)
Out[13]: 
                                    r
                                    c
z x                         y        
1 2016-11-01 00:00:00+00:00 3121  1.0
                            3923  1.0
                            3953  1.0
                            4880  1.0

In [14]: df.groupby(level=[x,y]).transform(lambda x: x.rank(ascending=False))
Out[14]: 
                                    r
                                    c
z x                         y        
1 2016-11-01 00:00:00+00:00 3121  1.0
                            3923  1.0
                            3953  1.0
                            4880  1.0

You need to use .transform; .agg is by definition a reducer.
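To make the distinction concrete, here is a minimal sketch on the small non-MultiIndex frame from the report. The variable names (`ranked_direct`, `ranked_transform`, `reduced`) are illustrative and not from the original thread, and the `quantile` reducer is only an assumed example of a parameterised function that does reduce. Behaviour is described for a reasonably recent pandas; older versions may raise different errors.

```python
import pandas as pd

# Same small frame as in the report.
df = pd.DataFrame({'A': [1, 1, 3, 3],
                   'B': [4, 5, 6, 1]})

# The groupby rank method accepts keyword arguments directly.
ranked_direct = df.groupby('A')['B'].rank(ascending=False)

# transform returns one value per input row, so a non-reducing
# lambda with a parameter is allowed here.
ranked_transform = df.groupby('A')['B'].transform(
    lambda s: s.rank(ascending=False)
)

# agg expects one value per group; a lambda with a parameter is fine
# as long as it reduces (quantile is an illustrative choice).
reduced = df.groupby('A')['B'].agg(lambda s: s.quantile(0.25))

print(ranked_direct)
print(ranked_transform)
print(reduced)
```

The first two results have the same length as `df`, while `reduced` has one row per group, which is the shape `.agg` is designed to produce.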

This is a duplicate of #11759.

I think there is a bug somewhere in there. If you want to dig into it, that would be appreciated.

jreback closed this as completed Nov 25, 2016
jreback added the Duplicate Report and Groupby labels Nov 25, 2016
jreback added this to the No action milestone Nov 25, 2016
jreback modified the milestones: Next Major Release, No action Nov 25, 2016

jreback commented Nov 25, 2016

Actually, I'm going to reopen this one. I think the other issue is fixed, and this one is slightly different.
