Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - Ranking order per group in Pandas

Consider a dataframe with three columns: group_ID, item_ID and value. Say there are 10 distinct item_ID values in total.

I need to rank each item_ID (1 to 10) within each group_ID based on value, and then see the mean rank (and other stats) across groups (e.g. the IDs with the highest value across groups would get ranks closer to 1). How can I do this in Pandas?

This answer does something very close with qcut, but not exactly the same.


A data example would look like:

      group_ID   item_ID  value
0   0S00A1HZEy        AB     10
1   0S00A1HZEy        AY      4
2   0S00A1HZEy        AC     35
3   0S03jpFRaC        AY     90
4   0S03jpFRaC        A5      3
5   0S03jpFRaC        A3     10
6   0S03jpFRaC        A2      8
7   0S03jpFRaC        A4      9
8   0S03jpFRaC        A6      2
9   0S03jpFRaC        AX      0

which would result in:

      group_ID   item_ID   rank
0   0S00A1HZEy        AB      2
1   0S00A1HZEy        AY      3
2   0S00A1HZEy        AC      1
3   0S03jpFRaC        AY      1
4   0S03jpFRaC        A5      5
5   0S03jpFRaC        A3      2
6   0S03jpFRaC        A2      4
7   0S03jpFRaC        A4      3
8   0S03jpFRaC        A6      6
9   0S03jpFRaC        AX      7


1 Reply


rank accepts several different arguments; it looks like you can get the results you want by calling rank(method="dense", ascending=False) after a groupby. Since rank returns floats, cast to int if you want whole-number ranks:

>>> df["rank"] = df.groupby("group_ID")["value"].rank(method="dense", ascending=False).astype(int)
>>> df
     group_ID item_ID  value  rank
0  0S00A1HZEy      AB     10     2
1  0S00A1HZEy      AY      4     3
2  0S00A1HZEy      AC     35     1
3  0S03jpFRaC      AY     90     1
4  0S03jpFRaC      A5      3     5
5  0S03jpFRaC      A3     10     2
6  0S03jpFRaC      A2      8     4
7  0S03jpFRaC      A4      9     3
8  0S03jpFRaC      A6      2     6
9  0S03jpFRaC      AX      0     7
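
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds the sample data from the question by hand, and the short `tied` series at the end is a made-up example (not from the question) that shows how "dense" differs from "min" when values repeat:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "group_ID": ["0S00A1HZEy"] * 3 + ["0S03jpFRaC"] * 7,
    "item_ID": ["AB", "AY", "AC", "AY", "A5", "A3", "A2", "A4", "A6", "AX"],
    "value": [10, 4, 35, 90, 3, 10, 8, 9, 2, 0],
})

# Rank within each group; the largest value in a group gets rank 1.
ranks = df.groupby("group_ID")["value"].rank(method="dense", ascending=False)

# rank() returns floats; the cast to int is safe here because there are no NaNs.
df["rank"] = ranks.astype(int)
print(df)

# Hypothetical series with a tie, to compare two tie-handling methods.
tied = pd.Series([10, 10, 5])
print(tied.rank(method="dense", ascending=False).tolist())  # [1.0, 1.0, 2.0]
print(tied.rank(method="min", ascending=False).tolist())    # [1.0, 1.0, 3.0]
```

With "dense" the ranks never skip a number after a tie, which is probably what you want if the ranks should always run 1, 2, 3, ... within each group.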

But note that if you're not using a global ranking scheme, finding the mean rank of each group isn't very meaningful. Unless a group contains duplicate values (and therefore duplicate ranks), its ranks are simply 1 to n, so their mean is (n + 1) / 2 and all you're measuring is how many elements the group has.
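
A quick sketch of that point, again using the question's sample data. The `per_group` and `per_item` names are just illustrative. The second half assumes that what the question is really after is the mean rank per item_ID rather than per group_ID, which is a different computation and may be closer to the intent:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "group_ID": ["0S00A1HZEy"] * 3 + ["0S03jpFRaC"] * 7,
    "item_ID": ["AB", "AY", "AC", "AY", "A5", "A3", "A2", "A4", "A6", "AX"],
    "value": [10, 4, 35, 90, 3, 10, 8, 9, 2, 0],
})
df["rank"] = df.groupby("group_ID")["value"].rank(method="dense", ascending=False)

# Mean rank per group: with no ties this is just (n + 1) / 2 for a group of n rows.
per_group = df.groupby("group_ID")["rank"].agg(["count", "mean"])
per_group["expected"] = (per_group["count"] + 1) / 2
print(per_group)

# Mean rank per item across the groups it appears in (assumed reading of the question).
per_item = df.groupby("item_ID")["rank"].agg(["mean", "count"])
print(per_item.sort_values("mean"))
```

In this sample only AY appears in both groups, so most of the per-item means are based on a single rank; with real data you would probably want to check the count column before reading much into the mean.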



...