Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - Diff of two Dataframes

I need to compare two dataframes of different sizes row-wise and print out the non-matching rows. Let's take the following two:

from pandas import DataFrame

df1 = DataFrame({
    'Buyer': ['Carl', 'Carl', 'Carl'],
    'Quantity': [18, 3, 5]})

df2 = DataFrame({
    'Buyer': ['Carl', 'Mark', 'Carl', 'Carl'],
    'Quantity': [2, 1, 18, 5]})

What is the most efficient way to go row-wise over df2 and print out the rows that are not in df1, e.g.:

Buyer     Quantity 
Carl         2
Mark         1

Important: I do not want the following row, which is only in df1, to be included in the diff:

Buyer     Quantity 
Carl         3

I have already tried the answers to "Comparing two dataframes of different length row by row and adding columns for each row with equal value" and "Compare two DataFrames and output their differences side-by-side".

But these do not match my problem.

1 Reply

Merge the two dataframes with how='outer' and pass indicator=True. This adds a _merge column that tells you whether each row is present in both frames, in the left one only, or in the right one only. You can then filter the merged dataframe on that column:

In [22]:
merged = df1.merge(df2, indicator=True, how='outer')
merged[merged['_merge'] == 'right_only']

Out[22]:
  Buyer  Quantity      _merge
3  Carl         2  right_only
4  Mark         1  right_only
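
For reuse, the merge-and-filter steps above could be wrapped in a small function. This is only a minimal sketch: the helper name rows_only_in_right is made up for illustration and is not part of pandas. It relies on merge joining on all columns the two frames share when no on= argument is given, so a row counts as matching only if both Buyer and Quantity are equal.

```python
import pandas as pd

df1 = pd.DataFrame({
    'Buyer': ['Carl', 'Carl', 'Carl'],
    'Quantity': [18, 3, 5]})

df2 = pd.DataFrame({
    'Buyer': ['Carl', 'Mark', 'Carl', 'Carl'],
    'Quantity': [2, 1, 18, 5]})


def rows_only_in_right(left, right):
    """Return the rows of `right` with no exact match in `left`.

    Hypothetical helper, not a pandas function.
    """
    # Outer merge on all shared columns; `_merge` records where each row came from.
    merged = left.merge(right, how='outer', indicator=True)
    # Keep only rows that exist in `right` but not in `left`.
    only_right = merged[merged['_merge'] == 'right_only']
    # Drop the bookkeeping column and renumber the result from 0.
    return only_right.drop(columns='_merge').reset_index(drop=True)


diff = rows_only_in_right(df1, df2)
print(diff)
```

Rows that exist only in df1, such as (Carl, 3), are tagged left_only by the merge and are therefore filtered out, which is the behaviour asked for in the question.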


...