Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
1.4k views
in Technique[技术] by (71.8m points)

dataframe - Merge two data frames based on common column values in Pandas

How to get merged data frame from two data frames having common column value such that only those rows make merged data frame having common value in a particular column.

I have 5000 rows of df1 as format : -

    director_name   actor_1_name    actor_2_name    actor_3_name    movie_title
0   James Cameron   CCH Pounder Joel David Moore    Wes Studi     Avatar
1   Gore Verbinski  Johnny Depp Orlando Bloom   Jack Davenport   Pirates 
    of the Caribbean: At World's End
2   Sam Mendes   Christoph Waltz    Rory Kinnear    Stephanie Sigman Spectre

and 10000 rows of df2 as

movieId                   genres                        movie_title
    1       Adventure|Animation|Children|Comedy|Fantasy   Toy Story
    2       Adventure|Children|Fantasy                    Jumanji
    3       Comedy|Romance                             Grumpier Old Men
    4       Comedy|Drama|Romance                      Waiting to Exhale

A common column 'movie_title' have common values and based on them, I want to get all rows where 'movie_title' is same. Other rows to be deleted.

Any help/suggestion would be appreciated.

Note: I already tried

pd.merge(dfinal, df1, on='movie_title')

and output comes like one row

director_name   actor_1_name    actor_2_name    actor_3_name    movie_title movieId title   genres

and on how ="outer"/"left", "right", I tried all and didn't get any row after dropping NaN although many common coloumn do exist.

Question&Answers:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

You can use pd.merge:

import pandas as pd
pd.merge(df1, df2, on="movie_title")

Only rows are kept for which common keys are found in both data frames. In case you want to keep all rows from the left data frame and only add values from df2 where a matching key is available, you can use how="left".


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...