Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
158 views
in Technique[技术] by (71.8m points)

python - Update value for every row based on either of two previous columns

I am researching ATP Tour male tennis data. Currently, I have a Pandas dataframe that contains ~60,000 matches. Every row contains information / statistics about the match, split between the winner and the loser. I have sorted the dataframe on date. Currently I am trying to calculate the ELO-rating of both the winner and the loser for every match (thus every row). To calculate the ELO-rating, one needs the ELO-rating for both players in their previous match. Another difficulty arises, as the winner of the current match might have been a loser in his previous match. As a result, the 'winner_player_id' value of the current match might be in the 'loser_player_id' column for the previous match.

I am not sure how to efficiently select the previous ELO-ratings for both players per row, as this entails a search across multiple columns.

Every row includes the following columns:

array(['match_id', 'tourney_dates', 'round_order', 'tourney_name',
   'tourney_year_id', 'tourney_round_name', 'winner_player_id',
   'winner_slug', 'loser_player_id', 'loser_slug', 'elo_player_1', 'elo_player_2'])

Your time is appreciated!

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

One approach would be to sort each winner and loser in each row by player name/ID, so the order will be stable regardless of who wins/loses. Here's an example:

df.join(pd.DataFrame(
    np.sort(df[['winner_name', 'loser_name']].values, axis=1),
    columns=['name1', 'name2']))

df.head(10)

Output:

      winner_name         loser_name              name1          name2
0   Nicklas Kulti      Michael Stich      Michael Stich  Nicklas Kulti
1   Michael Stich        Jim Courier        Jim Courier  Michael Stich
2   Nicklas Kulti     Magnus Larsson     Magnus Larsson  Nicklas Kulti
3     Jim Courier      Martin Sinner        Jim Courier  Martin Sinner
4   Michael Stich        Jimmy Arias        Jimmy Arias  Michael Stich
5   Nicklas Kulti    Fabrice Santoro    Fabrice Santoro  Nicklas Kulti
6  Magnus Larsson      Patrik Kuhnen     Magnus Larsson  Patrik Kuhnen
7     Jim Courier      Paul Haarhuis        Jim Courier  Paul Haarhuis
8   Nicklas Kulti  Magnus Gustafsson  Magnus Gustafsson  Nicklas Kulti
9   Michael Stich        Gilad Bloom        Gilad Bloom  Michael Stich

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...