Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - return rows with unique pairs across columns

I'm trying to find rows that have unique pairs of values across 2 columns, so this dataframe:

A    B
1    0
2    0
3    0
0    1
2    1
3    1
0    2
1    2
3    2
0    3
1    3
2    3

should be reduced to only the rows whose pair of values does not already appear with the columns flipped. For instance, 1 and 3 is a combination I only want returned once, so if the same pair also exists with the columns flipped (3 and 1), that row can be removed. The table I'm looking to get is:

A  B
0  2
0  3
1  0
1  2
1  3
2  3

That is, each pair of values should occur only once, no matter which column each value is in.

1 Reply

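I think you can sort each row with apply + sorted, then use drop_duplicates:

# sort the values within each row so mirrored pairs become identical;
# result_type='broadcast' keeps the original index and columns
# (without it, recent pandas returns a Series of lists, which cannot be de-duplicated)
df = df.apply(sorted, axis=1, result_type='broadcast').drop_duplicates()
print(df)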
   A  B
0  0  1
1  0  2
2  0  3
4  1  2
5  1  3
8  2  3

Faster solution with numpy.sort:

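import numpy as np
import pandas as pd

# sort each row with NumPy, rebuild the frame, then drop repeated rows
# (the outer parentheses allow the method chain to continue on the next line)
df = (pd.DataFrame(np.sort(df.values, axis=1), index=df.index, columns=df.columns)
        .drop_duplicates())
print(df)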
   A  B
0  0  1
1  0  2
2  0  3
4  1  2
5  1  3
8  2  3
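
Both snippets above replace each row with its sorted values, so a row like (1, 0) comes back as (0, 1). If the rows should keep their original orientation, one option is to use the sorted values only as a lookup key and filter the original frame with a boolean mask. This is a minimal sketch, assuming the sample data from the question; the names keys, mask and unique_pairs are made up for illustration:

```python
import numpy as np
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "A": [1, 2, 3, 0, 2, 3, 0, 1, 3, 0, 1, 2],
    "B": [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3],
})

# Sort each row so that (3, 1) and (1, 3) produce the same key.
keys = pd.DataFrame(np.sort(df.to_numpy(), axis=1), index=df.index)

# duplicated() flags every repeat after the first occurrence of a key,
# so negating it keeps only the first row of each unordered pair.
mask = ~keys.duplicated()

# Filter the original frame; the values in A and B are left untouched.
unique_pairs = df[mask]
print(unique_pairs)
```

This keeps the same index labels as the outputs above, but with the values as they appear in the original rows.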

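Solution without sorting, using DataFrame.min and DataFrame.max (this relies on there being exactly two columns):

# compute both row-wise results before overwriting either column
a = df.min(axis=1)   # smaller value of each pair
b = df.max(axis=1)   # larger value of each pair
df['A'] = a
df['B'] = b
df = df.drop_duplicates()
print(df)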
   A  B
0  0  1
1  0  2
2  0  3
4  1  2
5  1  3
8  2  3
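
The min/max snippet overwrites columns A and B of the original frame. If the input should stay untouched, the same idea can be wrapped in a small function that returns a new frame. This is only a sketch: drop_mirrored_pairs and its parameters first and second are hypothetical names, not part of pandas, and it assumes the two columns hold comparable values:

```python
import pandas as pd

def drop_mirrored_pairs(frame, first="A", second="B"):
    """Return a new frame with one row per unordered pair (hypothetical helper)."""
    pair = frame[[first, second]]
    # assign() returns a copy, so the caller's frame is not modified.
    normalized = frame.assign(**{
        first: pair.min(axis=1),    # smaller value of each pair
        second: pair.max(axis=1),   # larger value of each pair
    })
    return normalized.drop_duplicates(subset=[first, second])

# Sample data from the question.
df = pd.DataFrame({
    "A": [1, 2, 3, 0, 2, 3, 0, 1, 3, 0, 1, 2],
    "B": [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3],
})

result = drop_mirrored_pairs(df)
print(result)
```

Because drop_duplicates is limited to the two pair columns via subset, any other columns in the frame would be carried along from the first row of each pair.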
